(54) Title: A HIGH-THROUGHPUT DIAGNOSTIC ASSAY FOR THE HUMAN VIRUS CAUSING SEVERE ACUTE RESPIRATORY SYNDROME (SARS)



(57) Abstract: The present invention relates to a high-throughput diagnostic assay for the virus causing Severe Acute Respiratory
Syndrome (SARS) in humans ("hSARS virus"). In particular, the invention relates to a high-throughput reverse transcription-PCR
diagnostic test for SARS associated coronavirus (SARS-CoV). The present assay is a rapid, reliable assay which can be used for
diagnosis and monitoring the spread of SARS and is based on the nucleotide sequences of the N (nucleocapsid)-gene of the hSARS
virus. The present method eliminates false negative results and provides increased sensitivity for the assay. The invention also
discloses the S (spike)-gene of the hSARS virus. The invention further relates to the deduced amino acid sequences of the N-gene
and S-gene products of the hSARS virus and to the use of the N-gene and S-gene products in diagnostic methods. The invention
further encompasses diagnostic assays and kits comprising antibodies generated against the N-gene or S-gene product.

A HIGH-THROUGHPUT DIAGNOSTIC ASSAY FOR
THE HUMAN VIRUS CAUSING
SEVERE ACUTE RESPIRATORY SYNDROME
(SARS)



This application claims priority benefit to U.S. provisional application no. 
60/457,031, filed March 24, 2003; U.S. provisional application no. 60/457,730, filed March 
26, 2003; U.S. provisional application no. 60/459,931, filed April 2, 2003; U.S. provisional
application no. 60/460,357, filed April 3, 2003; U.S. provisional application no. 60/461,265,
filed April 8, 2003; U.S. provisional application no. 60/462,805, filed April 14, 2003; U.S.
provisional application no. 60/464,886, filed April 23, 2003; U.S. provisional application no.
60/465,738, filed April 25, 2003; and U.S. provisional application no. 60/470,935, filed
May 14, 2003, each of which is incorporated herein by reference in its entirety.

The instant application contains a lengthy Sequence listing which is being 
concurrently submitted via triplicate CD-R in lieu of a printed paper copy, and is hereby 
incorporated by reference in its entirety. Said CD-Rs, recorded on March 22, 2004, are
labeled "CRF", "Copy 1" and "Copy 2", respectively, and each contains only one identical 1.6
MB file (V9661077.APP).

1. INTRODUCTION 

The present invention relates to a high-throughput diagnostic assay for the virus 
causing Severe Acute Respiratory Syndrome (SARS) in humans ("hSARS virus"). In
particular, the invention relates to a high-throughput reverse transcription-PCR diagnostic
test for SARS associated coronavirus (SARS-CoV). The present assay is a rapid, reliable
assay which may be used for diagnosis and monitoring the spread of SARS. The present
method eliminates false negative results and provides increased sensitivity for the assay.
The invention further relates to nucleotide sequences and portions thereof, useful for the
diagnosis of SARS. The invention further relates to nucleotide sequences and portions
thereof, useful for assessing genetic diversity of SARS. The nucleotide sequences of the
present invention comprise the (nucleocapsid) N-gene and the S-gene sequences of the
hSARS virus. The invention relates to a diagnostic kit that comprises nucleic acid
molecules for the detection of the N-gene or S-gene of hSARS virus. The invention also
relates to the deduced amino acid sequences of the N-gene and S-gene products of the
hSARS virus. The invention further relates to the use of the N-gene and S-gene products in
diagnostic methods. The invention further encompasses diagnostic assays and kits
comprising antibodies generated against the N-gene or S-gene product.

2. BACKGROUND OF THE INVENTION 

Recently, there has been an outbreak of atypical pneumonia in Guangdong province 
in mainland China. Between November 2002 and March 2003, there were 792 reported
cases with 31 fatalities (WHO. Severe Acute Respiratory Syndrome (SARS). Weekly
Epidemiol Rec. 2003; 78: 86). Patients with SARS show various clinical symptoms,
including fever (of 38 degrees Celsius or above for over 24 hours), malaise, chills, headache
and body ache. Chest X-rays show changes compatible with pneumonia. Other symptoms
include coughing, shortness of breath or difficulty in breathing. By 3rd May 2003, a
cumulative total of 1621 cases and 179 deaths had occurred in Hong Kong,
which accounted for 26% and 41% of the globally reported cases (6234) and deaths (435),
respectively. As the disease is highly contagious and spreads in daily-life activities, it is
important to develop a rapid and reliable diagnostic test to monitor and control the disease.
In response to this crisis, the Hospital Authority in Hong Kong has increased the
surveillance on patients with severe atypical pneumonia. In the course of this investigation,
a number of clusters of health care workers with the disease were identified. In addition,
there were clusters of pneumonia incidents among persons in close contact with those
infected. The disease was unusual in its severity and its progression in spite of the
antibiotic treatment typical for the bacterial pathogens that are known to be commonly
associated with atypical pneumonia. The present inventors were one of the groups involved
in the investigation of these patients. All tests for identifying commonly recognized viruses
and bacteria were negative in these patients. Furthermore, diagnostic tests for the detection
of other genes in the hSARS virus, such as the 1b-gene, are not useful to accurately diagnose
SARS. The disease was given the acronym Severe Acute Respiratory Syndrome ("SARS").
This virus mutates and changes rapidly and hence the diagnosis of SARS was extremely
difficult until the isolation of particular regions of the virus, the N-gene and S-gene, of the
hSARS virus from the SARS patients by the present inventors as disclosed herein. Namely,
the present invention discloses a diagnostic assay using particular regions in the genome of
the virus for rapid, accurate, reliable and specific identification of the hSARS virus. The
invention is useful in both clinical and scientific research applications. Furthermore, the
present invention provides a high-throughput assay which can be used as a sensitive method
for diagnosis and monitoring the spread of SARS.
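
By way of illustration only, the Hong Kong proportions quoted in this section can be rechecked from the reported counts. The following minimal Python sketch uses only the four figures given above (1621 and 179 for Hong Kong, 6234 and 435 globally); the helper name share_percent is introduced here for illustration and is not part of the disclosure.

```python
# Counts quoted in the Background section (cumulative to 3 May 2003).
hk_cases, hk_deaths = 1621, 179
global_cases, global_deaths = 6234, 435


def share_percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


case_share = share_percent(hk_cases, global_cases)
death_share = share_percent(hk_deaths, global_deaths)

print(f"Hong Kong share of reported cases:  {case_share:.1f}%")
print(f"Hong Kong share of reported deaths: {death_share:.1f}%")
```

Rounded to whole percentages, the two shares agree with the 26% and 41% stated in the text.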

3. SUMMARY OF INVENTION 
The present invention is based upon the inventors' identification of a specific region 
of the hSARS virus, specifically, the 3' region of the hSARS viral genome, and in particular,
the (nucleocapsid) N-gene of the hSARS virus that may be used in a diagnostic assay to
detect hSARS. In particular, the N-gene is useful for the diagnosis of SARS because the
N-gene has the most abundant copy number during viral infection compared to any other gene
in the hSARS virus, especially when the cells are lysed. Thus, the nucleic acid sequences of
the N-gene of the hSARS virus are particularly useful in a rapid and reliable diagnostic
assay for the hSARS virus. Furthermore, the present method eliminates false negative
results and increases the sensitivity of the assay.

The hSARS virus was isolated from patients suffering from SARS in the recent
outbreak of severe atypical pneumonia in China. The isolated virus is an enveloped,
single-stranded RNA virus of positive polarity which belongs to the family Coronaviridae
of the order Nidovirales. The hSARS virus is a very large RNA virus consisting of
approximately 29,742 nucleotides. The complete genomic sequence of the hSARS virus
was deposited in GenBank, NCBI with Accession No: AY278491 (SEQ ID NO:15), which
is incorporated by reference. The isolated hSARS virus was deposited with China Center
for Type Culture Collection (CCTCC) on April 2, 2003 and accorded an accession number,
CCTCC-V200303, as described in Section 7, infra, which is incorporated by reference.
Also, the entire genome sequence of the hSARS virus, CCTCC-V200303, and
characterization thereof are disclosed in a United States Patent Application with Attorney
Docket No. V9661.0069 filed concurrently herewith on March 24, 2004, which is
incorporated by reference in its entirety. The virus mutates and changes rapidly,
making the diagnosis of SARS very difficult. The present inventors have designed a
diagnostic assay for detecting the presence of N-gene nucleic acid sequence or protein to
rapidly, accurately, and specifically identify the hSARS virus. Furthermore, the present
inventors have designed a diagnostic assay for detecting the presence of S-gene nucleic acid
sequence or protein to determine the genetic diversity of the hSARS virus. Accordingly, the
invention relates to methods of detecting nucleotide sequences of the N-gene and S-gene of
the hSARS virus.

The present invention provides a rapid, reliable assay for the detection of hSARS
virus. In a preferred embodiment, the detection of hSARS virus includes the use of the
nucleic acid molecules of the present invention in a polymerase chain reaction, Reverse
Transcription-Polymerase Chain Reaction (RT-PCR), Southern analysis, Northern analysis,
or other nucleic acid hybridization for the detection of hSARS nucleic acids. In one
embodiment, the invention provides methods for detecting the presence, activity or
expression of the hSARS virus of the invention in a biological material, such as cells,
nasopharyngeal aspirate, sputum, blood, saliva, urine, stool and so forth. In preferred
embodiments, the biological material is nasopharyngeal aspirate or stool. The increased or
decreased activity or expression of the hSARS virus in a sample relative to a control sample
can be determined by contacting the biological material with an agent which can detect
directly or indirectly the presence, activity or expression of the hSARS virus. In a specific
embodiment, the detecting agents are the nucleic acid molecules of the present invention.

The present invention also relates to a method for identifying a subject infected with 
the hSARS virus, said method comprising obtaining total RNA from a biological sample
obtained from the subject; reverse transcribing the total RNA to obtain cDNA; and
subjecting the cDNA to PCR assay using a set of primers derived from a nucleotide
sequence of the hSARS virus. In preferred embodiments, the primers are derived from the
(nucleocapsid) N-gene. In most preferred embodiments, the primers comprise the nucleotide
sequences of SEQ ID NOS:2475 and/or 2476 or SEQ ID NOS:2480 and/or 2481. In
other preferred embodiments, the primers are derived from the (spike) S-gene. In more
preferred embodiments, the primers comprise the nucleotide sequences of SEQ ID
NOS:2477 and/or 2478.
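
The primer-based PCR step of the above method may be illustrated in silico. The following Python sketch is offered only as an illustration under stated assumptions: it locates the fragment that a forward and reverse primer pair would bracket on a cDNA sense strand, assuming exact matches and ignoring annealing temperature and mismatches. The template and primer strings are invented placeholders and are not SEQ ID NOS:2475 to 2481, and the function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_amplicon(template: str, forward: str, reverse: str):
    """Return the fragment bracketed by the primer pair, or None.

    The forward primer is matched directly on the sense strand. The reverse
    primer anneals to the sense strand, so its reverse complement is searched
    for downstream of the forward primer.
    """
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    site = template.find(reverse_site, start + len(forward))
    if site == -1:
        return None
    return template[start:site + len(reverse_site)]


# Invented placeholder sequences (assumption: not taken from the sequence listing).
toy_template = "TTGACCGTAAGCTAGGCTAACGTTAGCCATGGATCCGTACGATTGCAAGT"
toy_forward = "GACCGTAAGC"
toy_reverse = "CAATCGTACG"

amplicon = find_amplicon(toy_template, toy_forward, toy_reverse)
print(amplicon, len(amplicon))
```

A pair that fails to bracket any fragment returns None, which in this sketch corresponds to no predicted product for that template.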

The invention further relates to the use of the sequence information of the isolated
virus for diagnostic and therapeutic methods. In a specific embodiment, the invention
provides nucleic acid molecules which are suitable for use as primers consisting of or
comprising the nucleotide sequence of SEQ ID NO:1, 11, 13, 15, 2471, or 2473, or a
complement thereof, or at least a portion of the nucleotide sequence thereof. In the most
preferred embodiment, the primers comprise the nucleic acid sequence of SEQ ID
NOS:2475 and/or 2476 or SEQ ID NOS:2480 and/or 2481 for the detection of the N-gene. In
another most preferred embodiment, the primers comprise the nucleic acid sequence of SEQ
ID NOS:2477 and/or 2478 for the detection of the S-gene. In another specific embodiment, the
invention provides nucleic acid molecules which are suitable for hybridization to hSARS
nucleic acid, including, but not limited to, PCR primers, reverse transcriptase primers, and
probes for Southern analysis or other nucleic acid hybridization analysis for the detection of
hSARS nucleic acids, e.g., consisting of or comprising the nucleotide sequence of SEQ ID
NO:1, 11, 13, 15, 2471, 2473, 2475, 2476, 2477, 2478, 2480 or 2481, or a complement
thereof, or a portion thereof. In a preferred embodiment, primers that amplify fragments
comprising the 1b gene (nucleotide positions 18057 to 18222 of SEQ ID NO:15, or portions
thereof); the M-gene (nucleotide positions 21920 to 22107 of SEQ ID NO:15, or portions
thereof); and the N-gene (nucleotide positions 28658 to 28883 of SEQ ID NO:15, or portions
thereof) may be used for probe synthesis for detection of hSARS nucleic acids. In a specific
embodiment, the invention provides a diagnostic kit comprising nucleic acid molecules which
are suitable for use to detect the N-gene of hSARS. In a specific embodiment, the N-gene
comprises the nucleic acid sequence of SEQ ID NO:2471. In a specific embodiment, the nucleic
acid molecules comprise the nucleic acid sequence of SEQ ID NOS:2475 and/or 2476 or SEQ ID
NOS:2480 and/or 2481. In preferred embodiments, the diagnostic kit further comprises a
control for amplification of the 1b gene. In specific embodiments, the primers used for
amplifying the 1b gene are SEQ ID NOS:3 and/or 4. In other specific embodiments, the
diagnostic kit further comprises an internal control using the pig β-actin gene. In specific
embodiments, the primers used for amplifying the β-actin gene are SEQ ID NOS:2482 and/or
2483.
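
The role of an internal control in guarding against false negative results may be sketched as follows. The decision rule below is an assumption made for illustration; this section states only that the kit may include an internal control using the pig β-actin gene and does not specify how results are scored. The function name interpret_well and the three result labels are hypothetical.

```python
def interpret_well(n_gene_detected: bool, internal_control_detected: bool) -> str:
    """Classify one reaction under an assumed, illustrative scoring rule.

    - N-gene product seen: report the sample as positive.
    - No N-gene product but the internal control amplified: report negative.
    - Neither amplified: the run is uninformative (for example, failed
      extraction or inhibited PCR), so it is flagged instead of being
      reported as a negative.
    """
    if n_gene_detected:
        return "positive"
    if internal_control_detected:
        return "negative"
    return "invalid"


# Hypothetical readouts for three wells: (N-gene, internal control).
readouts = [(True, True), (False, True), (False, False)]
calls = [interpret_well(n, c) for n, c in readouts]
print(calls)
```

Under this assumed rule, a sample is reported as negative only when the internal control shows that extraction and amplification worked, which is one way an assay can avoid reporting failed reactions as negatives.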

In a specific embodiment, the invention provides a diagnostic kit comprising nucleic
acid molecules which are suitable for use to detect the S-gene of hSARS. In a specific
embodiment, the S-gene comprises the nucleic acid sequence of SEQ ID NO:2473. In a specific
embodiment, the nucleic acid molecules comprise the nucleic acid sequence of SEQ ID NOS:
2477 and/or 2478. The invention further encompasses chimeric or recombinant viruses
encoded in whole or in part by said nucleotide sequences.

In another specific embodiment, the invention provides nucleic acid molecules
comprising a nucleotide sequence of SEQ ID NO:1, 11, 13, 2471, and/or 2473. In a specific
embodiment, the present invention provides isolated nucleic acid molecules comprising or,
alternatively, consisting of the nucleotide sequence of SEQ ID NO:1, a complement thereof
or a portion thereof, preferably at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300,
350, 400, 450, 500, 550, 600, or more contiguous nucleotides of the nucleotide sequence of
SEQ ID NO:1, or a complement thereof. In another specific embodiment, the present
invention provides isolated nucleic acid molecules comprising or, alternatively, consisting
of the nucleotide sequence of SEQ ID NO:11, a complement thereof or a portion thereof,
preferably at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500,
550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, or more
contiguous nucleotides of the nucleotide sequence of SEQ ID NO:11, or a complement
thereof. In yet another specific embodiment, the present invention provides isolated nucleic
acid molecules comprising or, alternatively, consisting of the nucleotide sequence of SEQ
ID NO:13, a complement thereof or a portion thereof, preferably at least 5, 10, 15, 20, 25,
30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, or more
contiguous nucleotides of the nucleotide sequence of SEQ ID NO:13, or a complement
thereof. In another specific embodiment, the present invention provides isolated nucleic
acid molecules comprising or, alternatively, consisting of the nucleotide sequence of SEQ
ID NO:15, a complement thereof or a portion thereof, preferably at least 5, 10, 15, 20, 25,
30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850,
900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000,
9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000,
20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or more
contiguous nucleotides of the nucleotide sequence of SEQ ID NO:15, or a complement
thereof. In another specific embodiment, the present invention provides isolated nucleic
acid molecules comprising or, alternatively, consisting of the nucleotide sequence of SEQ
ID NO:2471, a complement thereof or a portion thereof, preferably at least 5, 10, 15, 20, 25,
30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850,
900, 950, 1,000, 1,050, 1,100, 1,150, 1,200 or more contiguous nucleotides of the
nucleotide sequence of SEQ ID NO:2471, or a complement thereof. In another specific
embodiment, the present invention provides isolated nucleic acid molecules comprising or,
alternatively, consisting of the nucleotide sequence of SEQ ID NO:2473, a complement
thereof or a portion thereof, preferably at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150,
200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050,
1,100, 1,150, 1,200, 2,000, 3,000, or more contiguous nucleotides of the nucleotide
sequence of SEQ ID NO:2473, or a complement thereof. Furthermore, in another specific
embodiment, the invention provides isolated nucleic acid molecules which hybridize under
stringent conditions, as defined herein, to a nucleic acid molecule having the sequence of
SEQ ID NO:1, 11, 13, 15, 2471, or 2473, or a complement thereof. In one embodiment, the
invention provides an isolated nucleic acid molecule which is antisense to the coding strand
of a nucleic acid of the invention. In another specific embodiment, the invention provides
isolated polypeptides or proteins that are encoded by a nucleic acid molecule comprising or,
alternatively, consisting of a nucleotide sequence that is at least 5, 10, 15, 20, 25, 30, 35, 40,
45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, or more contiguous nucleotides of the
nucleotide sequence of SEQ ID NO:1, or a complement thereof. In yet another specific
embodiment, the invention provides isolated polypeptides or proteins that are encoded by a
nucleic acid molecule comprising or, alternatively, consisting of a nucleotide sequence that
is at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600,
650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200 or more contiguous
nucleotides of the nucleotide sequence of SEQ ID NO:11, or a complement thereof. In yet
another specific embodiment, the invention provides isolated polypeptides or proteins that
are encoded by a nucleic acid molecule comprising or, alternatively, consisting of a
nucleotide sequence that is at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350,
400, 450, 500, 550, 600, 650, 700, or more contiguous nucleotides of the nucleotide
sequence of SEQ ID NO:13, or a complement thereof. In yet another specific embodiment,
the invention provides isolated polypeptides or proteins that are encoded by a nucleic acid
molecule comprising or, alternatively, consisting of a nucleotide sequence that is at least 5,
10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700,
750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, 3,000, 4,000, 5,000, 6,000,
7,000, 8,000, 9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000,
19,000, 20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or
more contiguous nucleotides of the nucleotide sequence of SEQ ID NO:15, or a
complement thereof. In yet another specific embodiment, the invention provides isolated
polypeptides or proteins that are encoded by a nucleic acid molecule comprising or,
alternatively, consisting of a nucleotide sequence that is at least 5, 10, 15, 20, 25, 30, 35, 40,
45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950,
1,000, 1,050, 1,100, 1,150, 1,200 or more contiguous nucleotides of the nucleotide sequence
of SEQ ID NO:2471, or a complement thereof. In yet another specific embodiment, the
invention provides isolated polypeptides or proteins that are encoded by a nucleic acid
molecule comprising or, alternatively, consisting of a nucleotide sequence that is at least 5,
10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700,
750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, 3,000, or more
contiguous nucleotides of the nucleotide sequence of SEQ ID NO:2473, or a complement
thereof. The invention further provides proteins or polypeptides that are isolated from the
hSARS virus, including viral proteins isolated from cells infected with the virus but not
present in comparable uninfected cells. The invention further provides proteins or
polypeptides shown in Figures 11 (SEQ ID NOS:17-239, 241-736 and 738-1107) and 12
(SEQ ID NOS:1109-1589, 1591-1964 and 1966-2470). The invention further provides
proteins or polypeptides having the amino acid sequence of SEQ ID NO:2472 or 2474. The
polypeptides or the proteins of the present invention preferably have a biological activity of
the protein (including antigenicity and/or immunogenicity) encoded by the sequence of
SEQ ID NO:1, 11, 13, 2471, or 2473. In other embodiments, the polypeptides or the
proteins of the present invention have a biological activity of the protein (including
antigenicity and/or immunogenicity) encoded by a nucleotide sequence that is at least 5, 10,
15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750,
800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, 3,000, 4,000, 5,000, 6,000,
7,000, 8,000, 9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000,
19,000, 20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or
more contiguous nucleotides of the nucleotide sequence of SEQ ID NO:15, or a
complement thereof. In other embodiments, the polypeptides or the proteins of the present
invention have a biological activity of the protein (including antigenicity and/or
immunogenicity) of Figures 11 (SEQ ID NOS:17-239, 241-736 and 738-1107) and 12 (SEQ
ID NOS:1109-1589, 1591-1964 and 1966-2470). The invention further provides proteins or
polypeptides having a biological activity of the protein having the amino acid sequence of SEQ
ID NO:2472 or 2474.

In one aspect, the invention provides a method for propagating the hSARS virus in 
host cells comprising infecting the host cells with the isolated hSARS virus, culturing the 
host cells to allow the virus to multiply, and harvesting the resulting virions. Also provided
by the present invention are host cells that are infected with the hSARS virus.

In one aspect, the invention relates to the use of the isolated hSARS virus for
diagnostic and therapeutic methods. In a specific embodiment, the invention provides a
method of detecting in a biological sample an antibody immunospecific for the hSARS
virus using the isolated hSARS virus or any proteins or polypeptides thereof. In another
specific embodiment, the invention provides a method of screening for an antibody which
immunospecifically binds and neutralizes hSARS. Such an antibody is useful for passive
immunization or immunotherapy of a subject infected with hSARS.

The invention further provides antibodies that specifically bind a polypeptide of the
invention encoded by the nucleotide sequence of SEQ ID NO:1, 11, 13, 2471, or 2473, or a
fragment thereof, or encoded by a nucleic acid comprising a nucleotide sequence that
hybridizes under stringent conditions to the nucleotide sequence of SEQ ID NO:1, 11, 13,
2471, or 2473, and/or any hSARS epitope, having one or more biological activities of a
polypeptide of the invention. The invention further provides antibodies that specifically
bind polypeptides of the invention encoded by the nucleotide sequence of SEQ ID NO:15,
or a fragment thereof. These polypeptides include those shown in Figures 11 (SEQ ID
NOS:17-239, 241-736 and 738-1107) and 12 (SEQ ID NOS:1109-1589, 1591-1964 and
1966-2470). In another embodiment, the polypeptide comprises the amino acid sequence of
SEQ ID NO:2472 or 2474. The invention further provides antibodies that specifically bind
polypeptides of the invention encoded by a nucleic acid comprising a nucleotide sequence
that hybridizes under stringent conditions to the nucleotide sequence of SEQ ID NO:15,
and/or any hSARS epitope, having one or more biological activities of a polypeptide of the
invention. Such antibodies include, but are not limited to, polyclonal, monoclonal,
bi-specific, multi-specific, human, humanized, chimeric antibodies, single chain antibodies,
Fab fragments, F(ab')2 fragments, disulfide-linked Fvs, intrabodies and fragments
containing either a VL or VH domain or even a complementarity determining region (CDR)
that specifically binds to a polypeptide of the invention.

In another embodiment, the invention provides vaccine preparations, comprising the
hSARS virus, including recombinant and chimeric forms of said virus, or protein subunits
of the virus. In a specific embodiment, the vaccine preparations of the present invention
comprise live but attenuated hSARS virus with or without adjuvants. In another specific
embodiment, the vaccine preparations of the invention comprise an inactivated or killed
hSARS virus. Such attenuated or inactivated viruses may be prepared by a series of
passages of the virus through the host cells or by preparing recombinant or chimeric forms
of virus. Accordingly, the present invention further provides methods of preparing
recombinant or chimeric forms of hSARS. In another specific embodiment, the vaccine
preparations of the present invention comprise a nucleic acid or fragment of the hSARS
virus, e.g., the virus having accession no. CCTCC-V200303, or nucleic acid molecules
having the sequence of SEQ ID NO:1, 11, 13, 15, 2471 or 2473, or a fragment thereof. In
another embodiment, the invention provides vaccine preparations comprising one or more
polypeptides isolated from or produced from nucleic acid of hSARS virus, for example, of
deposit accession no. CCTCC-V200303. In a specific embodiment, the vaccine
preparations comprise a polypeptide of the invention encoded by the nucleotide sequence of
SEQ ID NO:1, 11, 13, 2471 or 2473, or a fragment thereof. In a specific embodiment, the
vaccine preparations comprise polypeptides of the invention as shown in Figures 11 (SEQ
ID NOS:17-239, 241-736 and 738-1107) and 12 (SEQ ID NOS:1109-1589, 1591-1964 and
1966-2470) or encoded by the nucleotide sequence of SEQ ID NO:15, or a fragment thereof.
In a specific embodiment, the vaccine preparations comprise polypeptides comprising the
amino acid sequence of SEQ ID NO:2472 or 2474. Furthermore, the present invention
provides methods for treating, ameliorating, managing or preventing SARS by
administering the vaccine preparations or antibodies of the present invention alone or in
combination with adjuvants, or other pharmaceutically acceptable excipients.

In another aspect, the present invention provides pharmaceutical compositions 
comprising anti-viral agents of the present invention and a pharmaceutically acceptable 
carrier. In a specific embodiment, the anti-viral agent of the invention is an antibody that 
immunospecifically binds hSARS virus or any hSARS epitope. In preferred embodiments, 
such antibodies neutralize the hSARS virus. In a specific embodiment, the anti-viral agent 
of the invention binds a fragment, variant, or homolog of the N-gene or S-gene of hSARS virus.
In a specific embodiment, the anti-viral agent of the invention binds a fragment, variant, or
homolog of a polypeptide comprising the amino acid sequence of SEQ ID NO:2472 or
2474. In another specific embodiment, the anti-viral agent is a polypeptide or protein of the 
present invention or nucleic acid molecule of the invention. The invention also provides 
kits containing a pharmaceutical composition of the present invention. 

3.1 Definitions 

The term "an antibody or an antibody fragment that immunospecifically binds a 
polypeptide of the invention" as used herein refers to an antibody or a fragment thereof that 
immunospecifically binds to the polypeptide encoded by the nucleotide sequence of SEQ ID 
NO:1, 11, 13, 15, 2471, or 2473, or the polypeptides shown in Figures 11 and 12, or a fragment
thereof, and does not non-specifically bind to other polypeptides. An antibody or a
fragment thereof that immunospecifically binds to the polypeptide of the invention may
cross-react with other antigens. Preferably, an antibody or a fragment thereof that
immunospecifically binds to a polypeptide of the invention does not cross-react with other
antigens. An antibody or a fragment thereof that immunospecifically binds to the
polypeptide of the invention can be identified by, for example, immunoassays or other
techniques known to those skilled in the art.

An "isolated" or "purified" peptide or protein is substantially free of cellular material 
or other contaminating proteins from the cell or tissue source from which the protein is
derived, or substantially free of chemical precursors or other chemicals when chemically
synthesized. The language "substantially free of cellular material" includes preparations of
a polypeptide/protein in which the polypeptide/protein is separated from cellular
components of the cells from which it is isolated or recombinantly produced. Thus, a
polypeptide/protein that is substantially free of cellular material includes preparations of the
polypeptide/protein having less than about 30%, 20%, 10%, 5%, 2.5%, or 1% (by dry
weight) of contaminating protein. When the polypeptide/protein is recombinantly produced,
it is also preferably substantially free of culture medium, i.e., culture medium represents
less than about 20%, 10%, or 5% of the volume of the protein preparation. When the
polypeptide/protein is produced by chemical synthesis, it is preferably substantially free of
chemical precursors or other chemicals, i.e., it is separated from chemical precursors or
other chemicals which are involved in the synthesis of the protein. Accordingly, such
preparations of the polypeptide/protein have less than about 30%, 20%, 10%, or 5% (by dry
weight) of chemical precursors or compounds other than the polypeptide/protein fragment of
interest. In a preferred embodiment of the present invention, polypeptides/proteins are
isolated or purified.

An "isolated" nucleic acid molecule is one which is separated from other nucleic 
acid molecules which are present in the natural source of the nucleic acid molecule. 
Moreover, an "isolated" nucleic acid molecule, such as a cDNA molecule, can be 
substantially free of other cellular material, or culture medium when produced by
recombinant techniques, or substantially free of chemical precursors or other chemicals
when chemically synthesized. In a preferred embodiment of the invention, nucleic acid
molecules encoding polypeptides/proteins of the invention are isolated or purified. The
term "isolated" nucleic acid molecule does not include a nucleic acid that is a member of a
library that has not been purified away from other library clones containing other nucleic
acid molecules.

The term "portion" or "fragment" as used herein refers to a fragment of a nucleic
acid molecule containing at least about 25, 30, 35, 40, 45, 100, 150, 200, 250, 300, 350, 400,
450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250,
1300, 1350, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, 11,000, 12,000,
13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000, 20,000, 21,000, 22,000, 23,000,
24,000, 25,000, 26,000, 27,000, 28,000, 29,000, or more contiguous nucleotides in length
of the relevant nucleic acid molecule and having at least one functional feature of the
nucleic acid molecule (or the encoded protein has one functional feature of the protein
encoded by the nucleic acid molecule); or a fragment of a protein or a polypeptide
containing at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 90, 100, 120,
140, 160, 180, 200, 220, 240, 260, 280, 300, 320, 340, 360, 400, 500, 600, 700, 800, 900,
1,000, 1,500, 2,000, 2,500, 3,000, 3,500, 4,000, 4,100, 4,200, 4,300, 4,350, 4,360, 4,370, or
4,380 amino acid residues in length of the relevant protein or polypeptide and having at
least one functional feature of the protein or polypeptide.

The term "3' region of the hSARS viral genome" refers to the region from about nucleotide
position 18,000 to 29,742 of SEQ ID NO:15.

The term "having a biological activity of the protein" or "having biological activities
of the polypeptides of the invention" refers to the characteristics of the polypeptides or
proteins having a common biological activity, a similar or identical structural domain, and/or
sufficient amino acid identity to the polypeptide encoded by the nucleotide sequence
of SEQ ID NO: 1, 11, 13, 15, 16, 240, 737, 1108, 1590, 1965, 2471 or 2473. Such common
biological activities of the polypeptides of the invention include antigenicity and
immunogenicity.

The term "under stringent condition" refers to hybridization and washing conditions
under which nucleotide sequences having at least 70%, at least 75%, at least 80%, at least
85%, at least 90%, or at least 95% identity to each other remain hybridized to each other.
Such hybridization conditions are described in, for example but not limited to, Current
Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6; Basic
Methods in Molecular Biology, Elsevier Science Publishing Co., Inc., N.Y. (1986), pp. 75-78
and 84-87; and Molecular Cloning, Cold Spring Harbor Laboratory, N.Y. (1982), pp.
387-389, and are well known to those skilled in the art. A preferred, non-limiting example
of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate
(SSC), 0.5% SDS at about 68°C followed by one or more washes in 2X SSC, 0.5% SDS at
room temperature. Another preferred, non-limiting example of stringent hybridization
conditions is hybridization in 6X SSC at about 45°C followed by one or more washes in
0.2X SSC, 0.1% SDS at about 50-65°C.
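
The identity thresholds named in the definition above can be illustrated with a minimal sketch
that computes percent identity over two pre-aligned sequences of equal length. The function
names and the example sequences are hypothetical and are not part of the disclosure; actual
hybridization behavior also depends on salt, temperature and sequence composition, which this
sketch does not model.

```python
# Identity thresholds listed in the definition of "stringent condition".
THRESHOLDS = (70, 75, 80, 85, 90, 95)


def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two pre-aligned, equal-length sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def thresholds_met(a: str, b: str) -> list[int]:
    """Return every listed threshold that the pair's identity reaches."""
    pid = percent_identity(a, b)
    return [t for t in THRESHOLDS if pid >= t]
```

For instance, two hypothetical 10-mers differing at one position are 90% identical and so meet
every threshold up to "at least 90%".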

The term "variant" as used herein refers either to a naturally occurring genetic
mutant of hSARS or a recombinantly prepared variation of hSARS, each of which contains
one or more mutations in its genome compared to the hSARS of CCTCC-V200303. The
term "variant" may also refer either to a naturally occurring variation of a given peptide or
a recombinantly prepared variation of a given peptide or protein in which one or more
amino acid residues have been modified by amino acid substitution, addition, or deletion.

4. BRIEF DESCRIPTION OF THE FIGURES 

Figure 1 shows a partial DNA sequence (SEQ ID NO: 1) and its deduced amino acid
sequence (SEQ ID NO: 2) obtained from the SARS virus that has 57% homology to the
RNA-dependent RNA polymerase protein of known Coronaviruses.

Figure 2 shows an electron micrograph of the novel hSARS virus, which has
morphological characteristics similar to those of coronaviruses.

Figure 3 shows an immunofluorescent staining for IgG antibodies that are
specifically bound to the FrHK-4 cells infected with the novel human respiratory virus of
Coronaviridae.

Figure 4 shows an electron micrograph of ultra-centrifuged deposit of hSARS virus
that was grown in the cell culture and negatively stained with 3% potassium
phosphotungstate at pH 7.0.

Figure 5A shows a thin-section electron micrograph of lung biopsy of a patient with
SARS; and Figure 5B shows a thin-section electron micrograph of hSARS-infected cells.

Figure 6 shows the result of phylogenetic analysis for the partial protein sequence
(215 amino acids; SEQ ID NO: 2) of the hSARS virus (GenBank accession number
AY268070). The phylogenetic tree is constructed by the neighbor-joining method. The
horizontal-line distance represents the number of sites at which the two sequences compared
are different. Bootstrap values are deduced from 500 replicates.

Figure 7A shows an amplification plot of fluorescence intensity against the PCR
cycle in a real-time quantitative PCR assay that can detect a hSARS virus in samples
quantitatively. The copy numbers of input plasmid DNA in the reactions are indicated. The
X-axis denotes the cycle number of a quantitative PCR assay and the Y-axis denotes the
fluorescence intensity (FI) over the background. Figure 7B shows the result of a melting
curve analysis of PCR products from clinical samples. Signals from positive (+ve) samples,
negative (-ve) samples and water control (water) are indicated. The X-axis denotes the
temperature (°C) and the Y-axis denotes the fluorescence intensity (FI) over the
background.

Figure 8 shows another partial DNA sequence (SEQ ID NO: 11) and its deduced
amino acid sequence (SEQ ID NO: 12) obtained from the SARS virus.

Figure 9 shows yet another partial DNA sequence (SEQ ID NO: 13) and its deduced
amino acid sequence (SEQ ID NO: 14) obtained from the SARS virus.

Figure 10 shows the entire genomic DNA sequence (SEQ ID NO: 15) of the SARS
virus.

Figure 11 shows the deduced amino acid sequences obtained from SEQ ID NO: 15 in
three frames (see SEQ ID NOS: 16, 240 and 737). An asterisk (*) indicates a stop codon
which marks the end of a peptide. The first-frame amino acid sequences: SEQ ID NOS:17-239;
the second-frame amino acid sequences: SEQ ID NOS:241-736; and the third-frame
amino acid sequences: SEQ ID NOS:738-1107.

Figure 12 shows the deduced amino acid sequences obtained from the complement
of SEQ ID NO: 15 in three frames (see SEQ ID NOS: 1108, 1590 and 1965). An asterisk (*)
indicates a stop codon which marks the end of a peptide. The first-frame amino acid
sequences: SEQ ID NOS:1109-1589; the second-frame amino acid sequences: SEQ ID
NOS:1591-1964; and the third-frame amino acid sequences: SEQ ID NOS:1966-2470.
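
The three-frame translations described for Figures 11 and 12 can be sketched as follows. This
is an illustrative sketch only, assuming the standard genetic code and a DNA-alphabet input; the
function names are hypothetical, and the sketch does not reproduce how the sequences of the
figures were actually generated.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A<->T, C<->G)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str, frame: int) -> str:
    """Translate one reading frame (0, 1 or 2); unknown codons become 'X'."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def three_frames(seq: str) -> list[str]:
    """Translations of the three forward reading frames of seq."""
    return [translate(seq, f) for f in range(3)]
```

Calling `three_frames(reverse_complement(seq))` gives the three frames of the complementary
strand, and splitting each translation on "*" yields the individual peptides that end at a stop
codon.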

Figure 13 shows the N-gene primer sequences (which amplify nucleotides at
positions 29247-29410 of SEQ ID NO:2471): 150# (SEQ ID NO:2475) and 200# (SEQ ID
NO:2476); and the S-gene primer sequences (which amplify nucleotides at positions 24751 to
25049 of SEQ ID NO:2473): 131# (SEQ ID NO:2477) and 132# (SEQ ID NO:2478).

Figure 14A shows the nucleic acid sequence of the N-gene (SEQ ID NO:2471).

Figure 14B shows the amino acid sequence of the N-gene (SEQ ID NO:2472).

Figure 15A shows the nucleic acid sequence of the S-gene (SEQ ID NO:2473).

Figure 15B shows the amino acid sequence of the S-gene (SEQ ID NO:2474).

Figure 16 shows the genome organization and transcription strategy of SARS-CoV
HK-39. Genomic and mRNA transcripts are capped (black circles), carry leader sequences
(vertical lines) at the 5' proximal end and are polyadenylated (A15). Arrows point to the
positions of the intergenic sequence, 5'-CTAAACGAAC-3' (SEQ ID NO:2479). After release of the
positive-sense genomic RNA in the cytoplasm of the host cell, the viral RNA-dependent RNA
polymerase, encoded from ORF 1a and 1b, is synthesized. It carries out transcription of a
full-length complementary (negative-sense) RNA, from which new genomic RNA, an
overlapping set of subgenomic mRNA transcripts, and leader RNA are synthesized. Note
that all transcripts have common 5' leader sequences and common 3' ends.
ORF 1a and 1b - RNA-dependent RNA polymerase; S - the major peplomer glycoprotein;
M - transmembrane glycoprotein; N - nucleocapsid; X1, X2, X3 - putative proteins.

Figure 17 shows a construct map of pSARSCoV-ORF1b-N. PCR products
amplified from ORF 1b (1b) and N gene of SARS-CoV were co-ligated into a cloning vector
pCR2.1-TOPO (Invitrogen). The nucleotide (nt) numbers correspond to the positions in
the sequence of HK-39 strain SARS-CoV (AY278491). Shadowed areas indicate the
amplicons by the primers used in diagnostic test (i.e., SEQ ID NOS:2480 and 2481).

Figure 18 shows a photo of an agarose gel after electrophoresis of total RNA
extracted from SARS patients using SV Total RNA isolation system. The extracted RNA
was then subjected to a reverse-transcription polymerase chain reaction (RT-PCR) assay for
the detection of coronavirus in the patients.

Figure 19 shows the effect of potential inhibitors in Reverse Transcription
Polymerase Chain Reaction (RT-PCR). To remove potential inhibitors, total RNA eluted
from SV96 Binding Plate was precipitated with 95% ethanol and 3 M sodium acetate and
resuspended in 12 µl of nuclease-free water. RT-PCR was performed with actin-F (SEQ ID
NO:2482) and actin-R (SEQ ID NO:2483) primers. The numbers indicated are the numbers of
pig kidney epithelial (PK-15) cells added to the sample as an internal control. There was no
DNA fragment amplified with untreated RNA samples.

Figure 20 shows the primers used for amplifying various genes. SRS251 (SEQ ID
NO:2480) and SRS252 (SEQ ID NO:2481) amplified a 225 base pair fragment from the
region of N-gene that showed no homology to other coronaviruses. coro3 (SEQ ID NO: 3)
and coro4 (SEQ ID NO: 4) amplified RNA-dependent RNA polymerase (1b gene) as a
control. Actin-F (SEQ ID NO:2482) and actin-R (SEQ ID NO:2483) amplified a 745 base
pair fragment from the β-actin gene as internal control for PCR assays.

Figure 21A shows an amplification plot of fluorescence intensity against the number
of PCR cycles. Black lines show the dynamic range of N-gene specific PCR with serially
diluted plasmid construct from 10^1 to 10^6 copies. NPA samples from non-SARS patients,
including patients suffering from adenovirus (n = 5), respiratory syncytial virus (n = 5),
human metapneumovirus (n = 5), influenza A virus (n = 5), or influenza B virus (n = 5)
infection, are shown in gray lines. Lines with triangles denote the SARS-CoV positive
NPA samples; NTC represents no template control; the X-axis indicates the cycle number of
quantitative PCR performed, while the Y-axis represents the fluorescence intensity (FAM-490)
over background signal (Delta Rn). The inset shows the melting curve analysis of the PCR
products. Signals from positive (+ve) samples, negative (-ve) samples and no template control are
indicated. The X-axis indicates the temperature (°C), while the Y-axis represents the fluorescence
intensity (Delta Rn). Figure 21B shows a comparison of the dynamic ranges of N-gene and
1b-gene specific PCRs. Dynamic ranges of both N-gene and 1b-gene PCR were obtained with the
same plasmid construct, in which a 1:1 ratio of the corresponding amplicons was subcloned.
Serially diluted plasmid with copy numbers ranging from 10^-1 to 10^5 copies was used as
template in both PCRs. Lines with triangles denote N-gene specific PCR while the gray
lines indicate 1b-gene specific PCR. The inset shows Ct values ± standard deviation in
triplicate sets of experiments for both PCRs with the different copy numbers of template used.
NTC represents no template control; the X-axis indicates the cycle number of quantitative PCR
performed, while the Y-axis represents fluorescence intensity.

Figures 22A and 22B show an amplification curve and a melting curve, respectively, of
real-time quantitative PCR specific to 1b (using the primers having SEQ ID NOS:3 and 4) and
N gene (using the primers having SEQ ID NOS:2480 and 2481) of SARS-CoV. Fig. 22A:
Amplification plot of fluorescence intensity against the number of PCR cycles. One (1) µl
of cDNA from an NPA, tracheal dispersion and lung biopsy of patients with clinical
symptoms was used as template in each PCR. Fifty (50) cycles of PCR were performed to
achieve the saturation phase of the reaction. The X-axis indicates the cycle number of
quantitative PCR performed, while the Y-axis represents the fluorescence intensity (FAM-490)
over background signal. The horizontal gray line indicates the threshold value calculated by the
maximum curvature approach, and the baseline cycle Ct was calculated automatically. The inset
shows the half-maximal fluorescence value (1/2 max) and Ct of both PCRs with cDNA from
various tissues isolated from a key patient (patient A indicated in Drosten et al., New Engl.
J. Med. 348:1967-76 (2003)) at three different time points. NPA = nasopharyngeal
aspirate; TW = tracheal wash; LW = lung wash.

WO 2004/085650
PCT/CN2004/000246

Fig. 22B: Melting curves of PCR products.
Melting curve analysis was carried out after a 10-minute further-extension step of the reaction.
The temperature was raised from 56°C to 94°C by 76 increments of 0.5°C each, while each
set-point temperature was held for 7 seconds for data collection and analysis. The melting
temperatures of the 1b- and N-gene specific PCR products were 80.5°C and 85.5°C, respectively.
The X-axis indicates the temperature in degrees Celsius while the Y-axis represents the fluorescence
intensity (FAM-490) over background signal. One (1) µl of water was used as no template
control in the reaction.
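
The melting-curve ramp described for Fig. 22B is simple arithmetic (56°C + 76 x 0.5°C = 94°C),
which the following sketch reproduces. The function names are hypothetical, and counting the
starting temperature as a held set point is an assumption of this sketch, not a statement of how
the instrument was programmed.

```python
def melt_set_points(start_c: float = 56.0, step_c: float = 0.5,
                    increments: int = 76) -> list[float]:
    """Set-point temperatures of the ramp, including the starting temperature."""
    return [start_c + i * step_c for i in range(increments + 1)]


def total_hold_seconds(points: list[float], hold_s: int = 7) -> int:
    """Total hold time, assuming every set point (including the first) is held."""
    return len(points) * hold_s


points = melt_set_points()
# The ramp ends at 94.0 and passes through both reported melting temperatures.
ramp_end = points[-1]
reported_tm_on_ramp = 80.5 in points and 85.5 in points
```

Under that assumption the ramp has 77 set points and about 539 seconds of hold time in total.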

Figure 23 shows the diagnostic result of 48 clinical samples using the primers
having SEQ ID NOS:2480 and 2481, respectively, with β-actin PCR as an internal control.
Upper bands in each row show a 745 bp DNA fragment amplified with actin-F and actin-R,
while lower bands are the amplicons produced by primers specific for the N-gene of SARS
coronavirus (225 bp). The -ve control (water) and +ve control (cDNA from SARS coronavirus
infected Vero cells) of the assay are indicated. Five (5) µl of PCR products of both
reactions were mixed and loaded into the sample well in a 2% agarose gel. M = 1 kb plus
molecular marker (Invitrogen).

Figure 24 shows Northern blot analysis of SARS-CoV total RNA. Total RNA of
SARS-CoV was extracted from SARS-CoV infected Vero E6 cells. RNA was separated in a
1% denaturing gel containing 6.29% formaldehyde. Afterwards, RNA was transferred to a
positively charged nylon membrane and hybridized with digoxigenin-labelled PCR
fragments specific to the 1b, S, M and N genes, respectively. Lane 1 - 1b; lane 2 - S; lane 3 -
M; lane 4 - N. The vertical bar shows the molecular size reference. Arrows indicate the
transcripts hybridized with the N probe. Signals were analyzed by chemiluminescence.

Figure 25 shows the DNA probes used in Northern blot analysis. The probes for 1b
gene (nt 18057-18222; SEQ ID NO:2484), S gene (nt 21920-22107; SEQ ID NO:2485), M
gene (nt 25867-26996; SEQ ID NO:2486), and N gene (nt 28658-28883; SEQ ID NO:2487)
are shown.

5. DETAILED DESCRIPTION OF THE INVENTION 

The present inventors developed a rapid, high-throughput reverse transcription-PCR
diagnostic test for SARS associated coronavirus (SARS-CoV). The 3' region of the hSARS
virus genome, including the Nucleocapsid gene (N-gene), represents a sensitive molecular
marker which can be used in addition to the 1b gene to increase the sensitivity of the test. An
internal control using PK-15 cells may be employed to ensure the integrity of RNA during
its extraction process and cDNA synthesis, thus eliminating false negative results.
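
The role of the internal control can be sketched as a small decision rule: a sample in which
neither the viral target nor the internal control amplifies is not reported as negative, because
RNA extraction or cDNA synthesis may have failed. The function name and the result labels below
are hypothetical, and reporting a target-positive sample as positive regardless of the control
signal is an assumption of this sketch.

```python
def interpret_sample(target_detected: bool, internal_control_detected: bool) -> str:
    """Classify one sample from its target and internal-control signals."""
    if target_detected:
        # Assumption: a target signal is reported even if the control is weak or absent.
        return "positive"
    if internal_control_detected:
        # The control amplified, so extraction and cDNA synthesis worked.
        return "negative"
    # Neither signal: the run is uninformative and the sample should be repeated.
    return "invalid"
```

Only the second branch supports a negative call, which is how the internal control guards
against false negative results.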

In mouse hepatitis virus (MHV), a typical member of the genus Coronavirus, both
genomic RNA and mRNA transcripts are capped and have common 3' ends and common
leader sequences on their 5' ends. With this unique transcription strategy, the copy numbers
of different viral genes during proliferation of the virus in its host are different (Figure 19). The
N gene, which encodes the nucleocapsid, has the most abundant copy number during virus
replication, as all transcripts may carry nucleotide sequence from the N gene, although they are
not all in-frame for translation of this gene product. The present inventors have discovered
that a diagnostic assay based on the 3' region, including the N-gene, of the viral genome
provides a more sensitive assay than one based on the rest of the viral genome. Accordingly, in
preferred embodiments, nucleic acid molecules that may be used for a diagnostic assay comprise
the nucleic acid sequence of nucleotide position 18000 to 29742 of SEQ ID NO: 15, or portions
thereof. The portions may comprise 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350,
400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, or
1200 nucleotides having nucleic acid sequence from nucleotide position 18000 to 29742
of SEQ ID NO: 15. In other preferred embodiments, nucleic acid molecules that may be
used for a diagnostic assay comprise the nucleic acid sequence of nucleotide position 28658
to 28883 or 29247-29410 of SEQ ID NO: 15.
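
The nucleotide ranges recited above can be turned into lengths and sub-sequences with a short
sketch. It assumes that the recited positions are 1-based and inclusive, which is the usual
convention for sequence listings but is an assumption here; the function names are hypothetical.

```python
def region_length(start: int, end: int) -> int:
    """Length of a region given 1-based, inclusive coordinates."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract_region(genome: str, start: int, end: int) -> str:
    """Sub-sequence for 1-based, inclusive coordinates (Python slices are 0-based)."""
    if end > len(genome):
        raise ValueError("region extends past the end of the sequence")
    return genome[start - 1:end]


# Ranges recited in the paragraph above, relative to SEQ ID NO: 15.
RECITED_RANGES = [(18000, 29742), (28658, 28883), (29247, 29410)]
LENGTHS = [region_length(s, e) for s, e in RECITED_RANGES]
```

With inclusive counting the three ranges span 11743, 226 and 164 nucleotides, respectively.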

Nasopharyngeal aspirate (NPA) and stool samples were obtained from SARS
suspected patients with major clinical symptoms and significant history of close contact
with infected patients. Total RNA was extracted from the subject samples, together with
PK-15 cells as an internal control. Samples were analyzed by the reverse-transcription-PCR
assay. Northern blot analysis was performed to show different subgenomic transcripts of
the virus. Real-time quantitative PCR was employed to compare the sensitivity of two loci
used in this diagnostic assay. In specific embodiments, PCR inhibitors were removed by
ethanol precipitation after the RNA extraction process.

In preferred embodiments, the present invention provides a method for detecting the
presence or absence of nucleic acid of the N-gene in a biological sample. The method
involves obtaining a biological sample from various sources and contacting the sample with
a compound or an agent capable of detecting a nucleic acid (e.g., mRNA, genomic RNA) of
the N-gene of the hSARS virus such that the presence of the N-gene is detected in the
sample. In preferred embodiments, the N-gene may be detected using a labeled nucleic acid
probe comprising the nucleotide sequence of SEQ ID NO:2471, a complement thereof, or a
portion thereof. The portion may be 10, 20, 30, 40, 50, 100, 200, 400, 500, 600, 800, 1000, or
1200 nucleotides in length. In preferred embodiments, primers comprising the nucleotide
sequence of SEQ ID NOS:2475 and/or 2476 or SEQ ID NOS:2480 and/or 2481 may be
used to amplify a portion of the N-gene for detection.

A preferred agent for detecting hSARS mRNA or genomic RNA of the invention is
a labeled nucleic acid probe capable of hybridizing to mRNA or genomic RNA encoding a
polypeptide of the invention. The nucleic acid probe can be, for example, a nucleic acid
molecule comprising or consisting of the nucleotide sequence of SEQ ID NO:1, 11, 13, 15,
2471 or 2473, a complement thereof, or a portion thereof, such as an oligonucleotide of at
least 15, 20, 25, 30, 50, 100, 250, 500, 750, 1,000 or more contiguous nucleotides in length
and sufficient to specifically hybridize under stringent conditions to a hSARS mRNA or
genomic RNA.

In another preferred specific embodiment, the presence of N-gene is detected in the
sample by a reverse transcription polymerase chain reaction (RT-PCR) using the primers
that are constructed based on a partial nucleotide sequence of the N-gene or a genomic
nucleic acid sequence of SEQ ID NO: 15, or based on a nucleotide sequence of SEQ ID
NO:1, 11, 13, 15, 2471, or 2473. In a non-limiting specific embodiment, preferred primers
to be used in an RT-PCR method are: 5'-TACACACCTCAGCGTTG-3' (SEQ ID NO:3)
and/or 5'-CACGAACGTGACGAAT-3' (SEQ ID NO:4), in the presence of 2.5 mM
MgCl2, and the thermal cycles are, for example, but not limited to, 94 °C for 8 min followed
by 40 cycles of 94 °C for 1 min, 50 °C for 1 min, 72 °C for 1 min (also see Sections 6.7
and 6.8 infra). In preferred embodiments, the primers comprise the nucleic acid sequence of
SEQ ID NOS:2475 and 2476. In another preferred embodiment, the primers comprise the
nucleic acid sequence of SEQ ID NOS:2480 and 2481. In preferred embodiments, the
thermal cycles are 94 °C for 10 min followed by 40 cycles of 94 °C for 30 seconds, 56 °C
for 30 seconds, 72 °C for 30 seconds, and then 72 °C for 10 minutes. In another preferred
embodiment, the thermal cycles are 94 °C for 3 min followed by 40 cycles of 94 °C for 30
seconds, 56 °C for 30 seconds, 72 °C for 30 seconds, and then 72 °C for 10 minutes. In a more
preferred specific embodiment, the present invention provides a real-time quantitative PCR
assay to detect the presence of hSARS virus in a biological sample by subjecting the cDNA
obtained by reverse transcription of the extracted total RNA from the sample to PCR
reactions using the specific primers, such as those having nucleotide sequences of SEQ ID
NOS:3 and 4, and a fluorescence dye, such as SYBR® Green I, which fluoresces when
bound non-specifically to double-stranded DNA. The fluorescence signals from these
reactions are captured at the end of extension steps as PCR product is generated over a
range of the thermal cycles, thereby allowing the quantitative determination of the viral load
in the sample based on an amplification plot (see Section 6.7, infra).
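
Quantitative determination from an amplification plot is commonly done with a standard curve,
in which the threshold cycle (Ct) of serially diluted plasmid is fitted against the logarithm of
the input copy number. The sketch below shows that general approach under stated assumptions:
it is not taken from Section 6.7, the function names are hypothetical, and any Ct values passed
to it would have to come from an actual run.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amplification_efficiency(slope: float) -> float:
    """Per-cycle efficiency implied by the slope; 1.0 means perfect doubling."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Estimated input copy number for a sample with the given Ct."""
    return 10 ** ((ct - intercept) / slope)
```

A slope near -3.32 cycles per ten-fold dilution corresponds to an efficiency near 1.0; slopes
far from that value suggest the curve should not be used for quantification.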

In the preferred embodiment, the present invention provides a method for detecting
the presence or absence of nucleic acid of the S-gene in a biological sample. The method
involves obtaining a biological sample from various sources and contacting the sample with
a compound or an agent capable of detecting a nucleic acid (e.g., mRNA, genomic RNA) of
the S-gene of the hSARS virus such that the presence of the S-gene is detected in the
sample. A preferred agent for detecting hSARS mRNA or genomic RNA of the invention is
a labeled nucleic acid probe capable of hybridizing to mRNA or genomic RNA encoding a
polypeptide of the invention. The nucleic acid probe can be, for example, a nucleic acid
molecule comprising or consisting of the nucleotide sequence of SEQ ID NO:1, 11, 13, 15,
2471, or 2473, or a portion thereof, such as an oligonucleotide of at least 15, 20, 25, 30, 50,
100, 250, 500, 750, 1,000 or more contiguous nucleotides in length and sufficient to
specifically hybridize under stringent conditions to a hSARS mRNA or genomic RNA.

In another preferred specific embodiment, the presence of S-gene is detected in the
sample by a reverse transcription polymerase chain reaction (RT-PCR) using the primers
that are constructed based on a partial nucleotide sequence of the S-gene (SEQ ID
NO:2473).

In vitro techniques for detection of mRNA include northern hybridizations, in situ
hybridizations, RT-PCR, and RNase protection. In vitro techniques for detection of
genomic RNA include northern hybridizations, RT-PCR, and RNase protection.

The polynucleotides encoding the N-gene may be amplified before they are detected.
The term "amplified" refers to the process of making multiple copies of the nucleic acid
from a single polynucleotide molecule. The amplification of polynucleotides can be carried
out in vitro by biochemical processes known to those of skill in the art. The amplification
agent may be any compound or system that will function to accomplish the synthesis of
primer extension products, including enzymes. Suitable enzymes for this purpose include,
for example, E. coli DNA polymerase I, Taq polymerase, Klenow fragment of E. coli DNA
polymerase I, T4 DNA polymerase, other available DNA polymerases, polymerase muteins,
reverse transcriptase, ligase, and other enzymes, including heat-stable enzymes (i.e., those
enzymes that perform primer extension after being subjected to temperatures sufficiently
elevated to cause denaturation). Suitable enzymes will facilitate combination of the
nucleotides in the proper manner to form the primer extension products that are
complementary to each mutant nucleotide strand. Generally, the synthesis will be initiated
at the 3'-end of each primer and proceed in the 5'-direction along the template strand, until
synthesis terminates, producing molecules of different lengths. There may be amplification
agents, however, that initiate synthesis at the 5'-end and proceed in the other direction,
using the same process as described above. In any event, the method of the invention is not
to be limited to the embodiments of amplification described herein.

One method of in vitro amplification, which can be used according to this invention,
is the polymerase chain reaction (PCR) described in U.S. Patent Nos. 4,683,202 and
4,683,195. The term "polymerase chain reaction" refers to a method for amplifying a DNA
base sequence using a heat-stable DNA polymerase and two oligonucleotide primers, one
complementary to the (+)-strand at one end of the sequence to be amplified and the other
complementary to the (-)-strand at the other end. Because the newly synthesized DNA
strands can subsequently serve as additional templates for the same primer sequences,
successive rounds of primer annealing, strand elongation, and dissociation produce rapid
and highly specific amplification of the desired sequence. The polymerase chain reaction is
used to detect the presence of polynucleotides encoding cytokines in the sample. Many
polymerase chain methods are known to those of skill in the art and may be used in the
method of the invention. For example, DNA can be subjected to 30 to 35 cycles of
amplification in a thermocycler as follows: 95°C for 30 sec, 52° to 60°C for 1 min, and
72°C for 1 min, with a final extension step of 72°C for 5 min. For another example, DNA
can be subjected to 35 polymerase chain reaction cycles in a thermocycler at a denaturing
temperature of 95°C for 30 sec, followed by varying annealing temperatures ranging from
54-58°C for 1 min, an extension step at 70°C for 1 min and a final extension step at 70°C.
The primers for use in amplifying the N-gene or S-gene of the invention may be
prepared using any suitable method, such as conventional phosphotriester and
phosphodiester methods or automated embodiments thereof so long as the primers are
capable of hybridizing to the polynucleotides of interest. One method for synthesizing
oligonucleotides on a modified solid support is described in U.S. Patent No. 4,458,066. The
exact length of primer will depend on many factors, including temperature, buffer, and
nucleotide composition. The primer must prime the synthesis of extension products in the
presence of the inducing agent for amplification.
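
The first cycling example in the paragraph above (30 to 35 cycles of 95°C for 30 sec, 52° to
60°C for 1 min, and 72°C for 1 min, with a final extension at 72°C for 5 min) can be written down
as a small data structure to estimate run time and the theoretical yield. This is a sketch only:
the class and function names are hypothetical, the 55°C annealing temperature is an arbitrary
value chosen from the stated range, ramp times between temperatures are ignored, and perfect
doubling per cycle is an upper bound rather than an expected result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # set-point temperature in degrees Celsius
    seconds: int    # hold time at that temperature


def program_seconds(cycle_steps, cycles, final_extension=None):
    """Total hold time of a cycling program, ignoring ramp times."""
    total = cycles * sum(step.seconds for step in cycle_steps)
    if final_extension is not None:
        total += final_extension.seconds
    return total


def ideal_fold_amplification(cycles: int) -> int:
    """Upper bound on fold amplification if every cycle doubled the product."""
    return 2 ** cycles


# 55 degrees C is an assumed annealing temperature within the 52-60 degree range.
CYCLE = [Step(95.0, 30), Step(55.0, 60), Step(72.0, 60)]
FINAL = Step(72.0, 300)
```

Under these assumptions, 30 cycles take 4800 seconds (80 minutes) of hold time and could yield
at most about 10^9-fold amplification.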

Primers used according to the method of the invention are complementary to each
strand of nucleotide sequence to be amplified. The term "complementary" means that the
primers must hybridize with their respective strands under conditions which allow the
agent for polymerization to function. In other words, the primers that are complementary to
the flanking sequences hybridize with the flanking sequences and permit amplification of
the nucleotide sequence. Preferably, the 3' terminus of the primer that is extended has
perfectly base paired complementarity with the complementary flanking strand. Primers
and probes for polynucleotides encoding N-gene or S-gene of the present invention can be
developed using known methods combined with the present disclosure.
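
How a primer pair that is complementary to the flanking sequences defines an amplicon can be
sketched with an exact-match search. The sketch assumes perfect complementarity over the whole
primer, which is stricter than the hybridization requirement stated above; the function names
and any sequences used with it are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A<->T, C<->G)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_amplicon(template: str, forward: str, reverse: str):
    """Return the region bounded by the two primers, or None if either fails to match.

    The forward primer must match the given strand; the reverse primer must match
    the opposite strand, i.e. its reverse complement must occur downstream.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(site)]
```

The length of the returned string is the expected amplicon size, which can be compared with
the band observed on a gel.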

Those of ordinary skill in the art will know of various amplification methodologies
that can also be utilized to increase the copy number of target nucleic acid. The
polynucleotides detected in the method of the invention can be further evaluated, detected,
cloned, sequenced, and the like, either in solution or after binding to a solid support, by any
method usually applied to the detection of a specific nucleic acid sequence such as another
polymerase chain reaction, oligomer restriction (Saiki et al., Bio/Technology 3:1008-1012
(1985)), allele-specific oligonucleotide (ASO) probe analysis (Conner et al., Proc. Natl.
Acad. Sci. USA 80:278 (1983)), oligonucleotide ligation assays (OLAs) (Landegren et al.,
Science 241:1011 (1988)), RNase protection assay and the like. Molecular techniques for
DNA analysis have been reviewed (Landegren et al., Science 242:229-237 (1988)).
Following DNA amplification, the reaction product may be detected by Southern blot
analysis, without using radioactive probes. In such a process, for example, a small sample
of DNA containing the polynucleotides obtained from the tissue or subject is amplified
and analyzed via a Southern blotting technique. The use of non-radioactive probes or labels
is facilitated by the high level of the amplified signal. In one embodiment of the invention,
one nucleoside triphosphate is radioactively labeled, thereby allowing direct visualization of
the amplification product by autoradiography. In another embodiment, amplification
primers are fluorescently labeled and run through an electrophoresis system. Visualization
of amplified products is by laser detection followed by computer assisted graphic display,
without a radioactive signal.

The methods of the present invention can involve a real-time quantitative PCR assay,
such as a Taqman® assay (Holland et al., Proc Natl Acad Sci USA, 88(16):7276 (1991);
also see U.S. Patent Application of Attorney Docket No. V9661.0078 filed March 24, 2004,
which is incorporated by reference in its entirety). The assays can be performed on an
instrument designed to perform such assays, for example those available from Applied
Biosystems (Foster City, CA). Primers and probes for such an assay can be designed
according to known procedures in the art.

The size of the primers used to amplify a portion of the N-gene or S-gene is at least
10, 15, 20, 25, or 30 nucleotides in length. In particular, primers that amplify the N-gene or
S-gene are most preferred. Preferably, the GC ratio should be above 30, 35, 40, 45, 50, 55, or
60% so as to prevent hairpin structures in the primer. Furthermore, the amplicon should be
sufficiently long to be detected by standard molecular biology methodologies.
Preferably, the amplicon is at least 40, 60, 100, 200, 300, 400, 500, 600, 800, or 1000 base
pairs in length.
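
The primer and amplicon criteria above lend themselves to a simple screening sketch. The
function names and default thresholds are hypothetical choices that use the lowest values
recited (10 nucleotides, a GC ratio above 30%, and a 40 base pair amplicon); the example primer
in the comment is the sequence given for SEQ ID NO:4 as read from the text.

```python
def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in a primer."""
    p = primer.upper()
    if not p:
        raise ValueError("primer must not be empty")
    return 100.0 * (p.count("G") + p.count("C")) / len(p)


def check_primer(primer: str, min_len: int = 10, min_gc: float = 30.0) -> bool:
    """True if the primer is long enough and its GC ratio is above the minimum."""
    return len(primer) >= min_len and gc_percent(primer) > min_gc


def check_amplicon(length_bp: int, min_bp: int = 40) -> bool:
    """True if the amplicon is at least the minimum length."""
    return length_bp >= min_bp


# Example: the 16-mer CACGAACGTGACGAAT has 8 G/C bases, i.e. a GC ratio of 50%.
```

Stricter embodiments can be screened by raising `min_len`, `min_gc` or `min_bp` to the higher
values recited above.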

In a specific embodiment, the methods further involve obtaining a control sample
from a control subject, contacting the control sample with a compound or agent capable of
detecting N-gene or S-gene, such that the presence of mRNA or genomic RNA encoding the
N-gene or S-gene is detected in the sample, and comparing the presence (or absence) of
N-gene or S-gene, or mRNA or genomic RNA encoding the polypeptide in the control sample
with the presence of N-gene or S-gene, or mRNA or genomic RNA encoding the
polypeptide in the test sample.

The invention also encompasses kits for detecting the presence of N-gene nucleic 
acid in a test sample. The kit, for example, can comprise a labeled compound or agent 
capable of detecting a nucleic acid molecule encoding the polypeptide in a test sample and,
in certain embodiments, a means for determining the amount of mRNA in the sample (an
oligonucleotide probe which binds to DNA or mRNA encoding the polypeptide). 

For oligonucleotide-based kits, the kit can comprise, for example: (1) an
oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic
acid sequence encoding a polypeptide of the invention or to a sequence within the N-gene;
or (2) a pair of primers useful for amplifying a nucleic acid molecule containing the N-gene
sequence. The kit can also comprise, e.g., a buffering agent, a preservative, or a protein
stabilizing agent. The kit can also comprise components necessary for detecting the
detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample
or a series of control samples which can be assayed and compared to the test sample.
Each component of the kit is usually enclosed within an individual container, and
all of the various containers are within a single package along with instructions for use.

The present invention relates to the isolated N-gene and S-gene of the hSARS virus.
In a specific embodiment, the virus comprises a nucleotide sequence of SEQ ID NO: 1, 11,
13, 15, 2471, and/or 2473. In a specific embodiment, the present invention provides
isolated nucleic acid molecules of the hSARS virus, comprising, or, alternatively, consisting
of the nucleotide sequence of SEQ ID NO: 1, 11, 13, 15, 2471, and/or 2473, a complement
thereof or a portion thereof. In another specific embodiment, the invention provides
isolated nucleic acid molecules which hybridize under stringent conditions, as defined
herein, to a nucleic acid molecule having the sequence of SEQ ID NO: 1, 11, 13, 15, 2471,
and/or 2473, or specific genes of known members of Coronaviridae, or a complement
thereof. In another specific embodiment, the invention provides isolated polypeptides or 
proteins that are encoded by a nucleic acid molecule comprising a nucleotide sequence that
is at least about 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550,
600, or more contiguous nucleotides of the nucleotide sequence of SEQ ID NO: 1, or a 
complement thereof. In another specific embodiment, the invention provides isolated 
polypeptides or proteins that are encoded by a nucleic acid molecule comprising a 
nucleotide sequence that is at least about 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200,
300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100,
1,150, 1,200, or more contiguous nucleotides of the nucleotide sequence of SEQ ID NO: 11,
or a complement thereof. In yet another specific embodiment, the invention provides 
isolated polypeptides or proteins that are encoded by a nucleic acid molecule comprising a 
nucleotide sequence that is at least about 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200,
300, 350, 400, 450, 500, 550, 600, 650, 700, or more contiguous nucleotides of the
nucleotide sequence of SEQ ID NO: 13, or a complement thereof. In yet another specific
embodiment, the invention provides isolated polypeptides or proteins that are encoded by a 
nucleic acid molecule comprising or, alternatively, consisting of a nucleotide sequence that
is at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600,
650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200 or more contiguous
nucleotides of the nucleotide sequence of SEQ ID NO: 2471, or a complement thereof. In 
yet another specific embodiment, the invention provides isolated polypeptides or proteins 
that are encoded by a nucleic acid molecule comprising or, alternatively, consisting of a
nucleotide sequence that is at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350,
400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150,
1,200, 2,000, 3,000, or more contiguous nucleotides of the nucleotide sequence of SEQ ID
NO: 2473, or a complement thereof. In yet another specific embodiment, the invention
provides isolated polypeptides or proteins that are encoded by a nucleic acid molecule
comprising or, alternatively, consisting of a nucleotide sequence that is at least 5, 10, 15, 20,
25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 
900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 
9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000,
20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or more
contiguous nucleotides of the nucleotide sequence of SEQ ID NO: 15, or a complement
thereof. The polypeptides include those shown in Figures 11 (SEQ ID NOS: 17-239,
241-736 and 738-1107) and 12 (SEQ ID NOS: 1109-1589, 1591-1964 and 1966-2470) or having
an amino acid sequence of SEQ ID NO: 2472 or 2474. The polypeptides or the proteins of
the present invention preferably have one or more biological activities of the proteins
encoded by the sequence of SEQ ID NO: 1, 11, 13, 15, 2471 or 2473, or the polypeptides
shown in Figures 11 and 12, or the native viral proteins containing the amino acid
sequences encoded by the sequence of SEQ ID NO: 1, 11, 13, 15, 2471 or 2473.

The present invention also relates to a method for propagating the hSARS virus in
host cells.

The invention further relates to the use of the sequence information of the isolated 
virus for diagnostic and therapeutic methods. In a specific embodiment, the invention 
provides the entire nucleotide sequence of hSARS virus, CCTCC-V200303, SEQ ID NO: 15,
or fragments, or complement thereof. Furthermore, the present invention relates to a
nucleic acid molecule that hybridizes to any portion of the genome of the hSARS virus,
CCTCC-V200303, or SEQ ID NO: 15, under stringent conditions. In a specific
embodiment, the invention provides nucleic acid molecules which are suitable for use as
primers consisting of or comprising the nucleotide sequence of SEQ ID NO: 1, 11, 13, 15,
2471 or 2473, or a complement thereof, or a portion thereof. In specific embodiments, the
primers comprise the nucleotide sequence of SEQ ID NO: 2475, 2476, 2477, 2478, 2480 or
2481. In another specific embodiment, the invention provides nucleic acid molecules which
are suitable for use as hybridization probes for the detection of nucleic acids encoding a 
polypeptide of the invention, consisting of or comprising the nucleotide sequence of SEQ
ID NO: 1, 11, 13, 15, 2471 or 2473, a complement thereof, or a portion thereof. The
invention further relates to a kit comprising primers having the nucleic acid sequences of SEQ
ID NOS: 2475 and 2476; and SEQ ID NOS: 2480 and 2481, for the detection of N-gene. In
another embodiment, the invention relates to a kit comprising primers having the nucleic acid
sequences of SEQ ID NOS: 2477 and/or 2478 for the detection of S-gene. In a preferred
embodiment, the kit further comprises reagents for the detection of genes not found in 
hSARS virus as a negative control. The invention further encompasses chimeric or 
recombinant viruses or viral proteins encoded by said nucleotide sequences. 
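
Several embodiments above refer to a sequence "or a complement thereof". As a hedged illustration only, the reverse complement of a primer or probe written in the DNA alphabet can be computed as sketched below; the function name and the short example input are assumptions for illustration and are not sequences from this disclosure.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical short sequence used only to show the call.
example = "ATGC"
example_rc = reverse_complement(example)
```

Applying the function twice returns the original sequence, which is a quick sanity check when preparing complementary probes.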

The invention further provides antibodies that specifically bind a polypeptide of the
invention encoded by the nucleotide sequence of SEQ ID NO: 1, 11, 13, 2471 or 2473, or a
fragment thereof, or any hSARS epitope. The invention further provides antibodies that
specifically bind a polypeptide having the amino acid sequence of SEQ ID NO: 2472 or 2474.
The invention further provides antibodies that specifically bind the polypeptides of the
invention encoded by the nucleotide sequence of SEQ ID NO: 15, or the polypeptides shown
in Figures 11 and 12, or a fragment thereof, or any hSARS epitope. Such antibodies include,
but are not limited to, polyclonal, monoclonal, bi-specific, multi-specific, human, humanized,
chimeric antibodies, single chain antibodies, Fab fragments, F(ab')2 fragments,
disulfide-linked Fvs, intrabodies and fragments containing either a VL or VH domain or even a
complementarity determining region (CDR) that specifically binds to a polypeptide of the
invention.

In one embodiment, the invention provides methods for detecting the presence, 
activity or expression of the hSARS virus of the invention in a biological material, such as 
cells, blood, saliva, urine, sputum, nasopharyngeal aspirates, and so forth. The presence of 
the hSARS virus in a sample can be determined by contacting the biological material with
an agent which can detect directly or indirectly the presence of the hSARS virus. In a
specific embodiment, the detection agents are the antibodies of the present invention. In 
another embodiment, the detection agent is a nucleic acid of the present invention. 

In another embodiment, the invention provides vaccine preparations comprising the 
hSARS virus, including recombinant and chimeric forms of said virus, or subunits of the
virus. In a specific embodiment, the vaccine preparations comprise live but attenuated
hSARS virus with or without pharmaceutically acceptable excipients, including adjuvants.
In another specific embodiment, the vaccine preparations comprise an inactivated or killed 
hSARS virus with or without pharmaceutically acceptable excipients, including adjuvants.
The vaccine preparations of the present invention may further comprise adjuvants.
Accordingly, the present invention further provides methods of preparing recombinant or
chimeric forms of hSARS. In another specific embodiment, the vaccine preparations of the
present invention comprise one or more nucleic acid molecules comprising or consisting of
the sequence of SEQ ID NO: 1, 11, 13, 15, 2471 and/or 2473, or a fragment thereof. In
another embodiment, the invention provides vaccine preparations comprising one or more
polypeptides of the invention encoded by a nucleotide sequence comprising or consisting of
the nucleotide sequence of SEQ ID NO: 1, 11, 13, 2471 and/or 2473, or the polypeptides
shown in Figures 11 and 12, or a fragment thereof. In another embodiment, the invention
provides vaccine preparations comprising one or more polypeptides of the invention
encoded by a nucleotide sequence comprising or consisting of the nucleotide sequence of
SEQ ID NO: 15, or a fragment thereof. Furthermore, the present invention provides
methods for treating, ameliorating, managing, or preventing SARS by administering the
vaccine preparations or antibodies of the present invention alone or in combination with
antivirals [e.g., amantadine, rimantadine, ganciclovir, acyclovir, ribavirin, penciclovir,
oseltamivir, foscarnet, zidovudine (AZT), didanosine (ddI), lamivudine (3TC), zalcitabine
(ddC), stavudine (d4T), nevirapine, delavirdine, indinavir, ritonavir, vidarabine, nelfinavir,
saquinavir, Relenza, Tamiflu, pleconaril, interferons, etc.], steroids and corticosteroids such
as prednisone, cortisone, fluticasone and glucocorticoid, antibiotics, analgesics,
bronchodilators, or other treatments for respiratory and/or viral infections.

Furthermore, the present invention provides pharmaceutical compositions 
comprising anti-viral agents of the present invention and a pharmaceutically acceptable
carrier. The present invention also provides kits comprising pharmaceutical compositions 
of the present invention. 

In another aspect, the present invention provides methods for screening anti-viral
agents that inhibit the infectivity or replication of hSARS virus or variants thereof.

5.1 Recombinant and Chimeric hSARS Viruses 

The present invention encompasses recombinant or chimeric viruses encoded by
viral vectors derived from the genome of hSARS virus or natural variants thereof. In a
specific embodiment, a recombinant virus is one derived from the hSARS virus of deposit
accession no. CCTCC-V200303. In a specific embodiment, the virus has a nucleotide
sequence of SEQ ID NO: 15. In another specific embodiment, a recombinant virus is one
derived from a natural variant of hSARS virus. A natural variant of hSARS has a sequence
that is different from the genomic sequence (SEQ ID NO: 15) of the hSARS virus,
CCTCC-V200303, due to one or more naturally occurring mutations, including, but not limited
to, point mutations, rearrangements, insertions, deletions etc., to the genomic sequence that
may or may not result in a phenotypic change. In accordance with the present invention, a 
viral vector which is derived from the genome of the hSARS virus, CCTCC-V200303, is 
one that contains a nucleic acid sequence that encodes at least a part of one ORF of the 
hSARS virus. In a specific embodiment, the ORF comprises or consists of a nucleotide
sequence of SEQ ID NO: 1, 11, 13, 2471, 2473, or a fragment thereof. In a specific
embodiment, there is more than one ORF within the nucleotide sequence of SEQ ID
NO: 15, as shown in Figures 11 (SEQ ID NOS: 16, 240 and 737) and 12 (SEQ ID NOS: 1108,
1590 and 1965), or a fragment thereof. In another embodiment, the polypeptide encoded by
the ORF comprises or consists of an amino acid sequence of SEQ ID NO: 2, 12, 14, 2472,
2474, or a fragment thereof, or shown in Figures 11 (SEQ ID NOS: 17-239, 241-736 and
738-1107) and 12 (SEQ ID NOS: 1109-1589, 1591-1964 and 1966-2470), or a fragment
thereof. In accordance with the present invention these viral vectors may or may not 
include nucleic acids that are non-native to the viral genome. 

In another specific embodiment, a chimeric virus of the invention is a recombinant
hSARS virus which further comprises a heterologous nucleotide sequence. In accordance
with the invention, a chimeric virus may be encoded by a nucleotide sequence in which 
heterologous nucleotide sequences have been added to the genome or in which endogenous 
or native nucleotide sequences have been replaced with heterologous nucleotide sequences. 
According to the present invention, the chimeric viruses are encoded by the viral
vectors of the invention which further comprise a heterologous nucleotide sequence. In
accordance with the present invention a chimeric virus is encoded by a viral vector that may
or may not include nucleic acids that are non-native to the viral genome. In accordance 
with the invention a chimeric virus is encoded by a viral vector to which heterologous 
nucleotide sequences have been added, inserted or substituted for native or non-native
sequences. In accordance with the present invention, the chimeric virus may be encoded by
nucleotide sequences derived from different strains or variants of hSARS virus. In 
particular, the chimeric virus is encoded by nucleotide sequences that encode antigenic 
polypeptides derived from different strains or variants of hSARS virus.

A chimeric virus may be of particular use for the generation of recombinant vaccines
protecting against two or more viruses (Tao et al., J. Virol. 72, 2955-2961; Durbin et al.,
2000, J. Virol. 74, 6821-6831; Skiadopoulos et al., 1998, J. Virol. 72, 1762-1768;
Teng et al., 2000, J. Virol. 74, 9317-9321). For example, it can be envisaged that a virus
vector derived from the hSARS virus expressing one or more proteins of variants of hSARS
virus, or vice versa, will protect a subject vaccinated with such vector against infections by
both the native hSARS and the variant. Attenuated and replication-defective viruses may be
of use for vaccination purposes with live vaccines as has been suggested for other viruses.
(See PCT WO 02/057302, at pp. 6 and 23, incorporated by reference herein.)

In accordance with the present invention, the heterologous sequences to be
incorporated into the viral vectors encoding the recombinant or chimeric viruses of the
invention include sequences obtained or derived from different strains or variants of hSARS.

In certain embodiments, the chimeric or recombinant viruses of the invention are 
encoded by viral vectors derived from viral genomes wherein one or more sequences,
intergenic regions, termini sequences, or portions or entire ORF have been substituted with
a heterologous or non-native sequence. In certain embodiments of the invention, the 
chimeric viruses of the invention are encoded by viral vectors derived from viral genomes 
wherein one or more heterologous sequences have been inserted or added to the vector. 

The selection of the viral vector may depend on the species of the subject that is to
be treated or protected from a viral infection. If the subject is human, then an attenuated
hSARS virus can be used to provide the antigenic sequences. 

In accordance with the present invention, the viral vectors can be engineered to 
provide antigenic sequences which confer protection against infection by the hSARS and 
natural variants thereof. The viral vectors may be engineered to provide one, two, three or
more antigenic sequences. In accordance with the present invention the antigenic sequences
may be derived from the same virus, from different strains or variants of the same type of 
virus, or from different viruses. 

The expression products and/or recombinant or chimeric virions obtained in
accordance with the invention may advantageously be utilized in vaccine formulations. The
expression products and chimeric virions of the present invention may be engineered to
create vaccines against a broad range of pathogens, including viral and bacterial antigens,
tumor antigens, allergen antigens, and auto antigens involved in autoimmune disorders. In
particular, the chimeric virions of the present invention may be engineered to create
vaccines for the protection of a subject from infections with hSARS virus and variants
thereof.

In certain embodiments, the expression products and recombinant or chimeric
virions of the present invention may be engineered to create vaccines against a broad range
of pathogens, including viral antigens, tumor antigens and auto antigens involved in
autoimmune disorders. One way to achieve this goal involves modifying existing hSARS
genes to contain foreign sequences in their respective external domains. Where the
heterologous sequences are epitopes or antigens of pathogens, these chimeric viruses may
be used to induce a protective immune response against the disease agent from which these
determinants are derived.

Thus, the present invention relates to the use of viral vectors and recombinant or
chimeric viruses to formulate vaccines against a broad range of viruses and/or antigens.
The present invention also encompasses recombinant viruses comprising a viral vector
derived from the hSARS or variants thereof which contains sequences which result in a
virus having a phenotype more suitable for use in vaccine formulations, e.g., attenuated
phenotype or enhanced antigenicity. The mutations and modifications can be in coding
regions, in intergenic regions and in the leader and trailer sequences of the virus.

The invention provides a host cell comprising a nucleic acid or a vector according to
the invention. Plasmid or viral vectors containing the polymerase components of hSARS
virus are generated in prokaryotic cells for the expression of the components in relevant cell
types (bacteria, insect cells, eukaryotic cells). Plasmid or viral vectors containing
full-length or partial copies of the hSARS genome will be generated in prokaryotic cells for
the expression of viral nucleic acids in vitro or in vivo. The latter vectors may contain
other viral sequences for the generation of chimeric viruses or chimeric virus proteins, may
lack parts of the viral genome for the generation of replication defective virus, and may
contain mutations, deletions or insertions for the generation of attenuated viruses. In
addition, the present invention provides a host cell infected with hSARS virus, for example,
of deposit no. CCTCC-V200303.

Infectious copies of hSARS (being wild type, attenuated, replication-defective or
chimeric) can be produced upon co-expression of the polymerase components according to
the state-of-the-art technologies described above.

In addition, eukaryotic cells, transiently or stably expressing one or more full-length
or partial hSARS proteins can be used. Such cells can be made by transfection (proteins or
nucleic acid vectors), infection (viral vectors) or transduction (viral vectors) and may be
useful for complementation of the mentioned wild type, attenuated, replication-defective or
chimeric viruses.

The viral vectors and chimeric viruses of the present invention may be used to 
modulate a subject's immune system by stimulating a humoral immune response, a cellular
immune response or by stimulating tolerance to an antigen. As used herein, a subject means: 
humans, primates, horses, cows, sheep, pigs, goats, dogs, cats, avian species and rodents. 

5.2 Formulation of Vaccines and Antivirals 

In a preferred embodiment, the invention provides a proteinaceous molecule or
hSARS virus-specific viral protein or functional fragment thereof encoded by a nucleic acid
according to the invention. Useful proteinaceous molecules are for example derived from
any of the genes or genomic fragments derivable from the virus according to the invention,
including envelope protein (E protein), integral membrane protein (M protein), spike protein
(S protein), nucleocapsid protein (N protein), hemagglutinin esterase (HE protein), and
RNA-dependent RNA polymerase. Such molecules, or antigenic fragments thereof, as provided
herein, are for example useful in diagnostic methods or kits and in pharmaceutical
compositions such as subunit vaccines. Particularly useful are polypeptides encoded by the
nucleotide sequence of SEQ ID NO: 1, 11, 13, 15, 2471, 2473, or as shown in Fig. 11 (SEQ
ID NOS: 17-239, 241-736 and 738-1107) and Fig. 12 (SEQ ID NOS: 1109-1589, 1591-1964
and 1966-2470), or having the amino acid sequence of SEQ ID NO: 2472 or 2474, or
antigenic fragments thereof for inclusion as antigen or subunit immunogen, but inactivated
whole virus can also be used. Particularly useful are also those proteinaceous substances
that are encoded by recombinant nucleic acid fragments of the hSARS genome; preferred
are those that are within the preferred metes and bounds of ORFs, in particular, for
eliciting hSARS-specific antibody or T cell responses, whether in vivo (e.g., for protective or
therapeutic purposes or for providing diagnostic antibodies) or in vitro (e.g., by phage
display technology or another technique useful for generating synthetic antibodies).

The invention provides vaccine formulations for the prevention and treatment of
infections with hSARS virus. In certain embodiments, the vaccine of the invention
comprises recombinant and chimeric viruses of the hSARS virus. In certain embodiments,
the virus is attenuated.

In another embodiment of this aspect of the invention, inactivated vaccine
formulations may be prepared using conventional techniques to "kill" the chimeric viruses.
Inactivated vaccines are "dead" in the sense that their infectivity has been destroyed.
Ideally, the infectivity of the virus is destroyed without affecting its immunogenicity. In
order to prepare inactivated vaccines, the chimeric virus may be grown in cell culture or in
the allantois of the chick embryo, purified by zonal ultracentrifugation, inactivated by
formaldehyde or β-propiolactone, and pooled. The resulting vaccine is usually inoculated
intramuscularly.

Inactivated viruses may be formulated with a suitable adjuvant in order to enhance
the immunological response. Such adjuvants may include but are not limited to mineral
gels, e.g., aluminum hydroxide; surface active substances such as lysolecithin, pluronic 
polyols, polyanions; peptides; oil emulsions; and potentially useful human adjuvants such as 
BCG and Corynebacterium parvum. 

In another aspect, the present invention also provides DNA vaccine formulations
comprising a nucleic acid or fragment of the hSARS virus, e.g., the virus having accession
no. CCTCC-V200303, or nucleic acid molecules having the sequence of SEQ ID NO: 1, 11,
13, 15, 2471, 2473, or a fragment thereof. In another specific embodiment, the DNA
vaccine formulations of the present invention comprise a nucleic acid or fragment thereof
encoding the antibodies which immunospecifically bind hSARS viruses. In DNA vaccine
formulations, a vaccine DNA comprises a viral vector, such as that derived from the hSARS
virus, bacterial plasmid, or other expression vector, bearing an insert comprising a nucleic
acid molecule of the present invention operably linked to one or more control elements,
thereby allowing expression of the vaccinating proteins encoded by said nucleic acid
molecule in a vaccinated subject. Such vectors can be prepared by recombinant DNA
technology as recombinant or chimeric viral vectors carrying a nucleic acid molecule of the
present invention (see also Section 5.1, supra).

Various heterologous vectors are described for DNA vaccinations against viral
infections. For example, the vectors described in the following references may be used to
express hSARS sequences instead of the sequences of the viruses or other pathogens
described; in particular, vectors described for hepatitis B virus (Michel, M.L. et al., 1995,
DNA-mediated immunization to the hepatitis B surface antigen in mice: Aspects of the
humoral response mimic hepatitis B viral infection in humans, Proc. Natl. Acad. Sci. USA
92:5307-5311; Davis, H.L. et al., 1993, DNA-based immunization induces continuous
secretion of hepatitis B surface antigen and high levels of circulating antibody, Human Molec.
Genetics 2:1847-1851), HIV virus (Wang, B. et al., 1993, Gene inoculation generates
immune responses against human immunodeficiency virus type 1, Proc. Natl. Acad. Sci. USA
90:4156-4160; Lu, S. et al., 1996, Simian immunodeficiency virus DNA vaccine trial in
macaques, J. Virol. 70:3978-3991; Letvin, N.L. et al., 1997, Potent, protective anti-HIV
immune responses generated by bimodal HIV envelope DNA plus protein vaccination, Proc.
Natl. Acad. Sci. USA 94(17):9378-83), and influenza viruses (Robinson, H.L. et al., 1993,
Protection against a lethal influenza virus challenge by immunization with a
haemagglutinin-expressing plasmid DNA, Vaccine 11:957-960; Ulmer, J.B. et al.,
Heterologous protection against influenza by injection of DNA encoding a viral protein,
Science 259:1745-1749), as well as bacterial infections, such as tuberculosis (Tascon, R.E.
et al., 1996, Vaccination against tuberculosis by DNA injection, Nature Med. 2:888-892;
Huygen, K. et al., 1996, Immunogenicity and protective efficacy of a tuberculosis DNA
vaccine, Nature Med. 2:893-898), and parasitic infections, such as malaria (Sedegah, M.,
1994, Protection against malaria by immunization with plasmid DNA encoding
circumsporozoite protein, Proc. Natl. Acad. Sci. USA 91:9866-9870; Doolan, D.L. et al.,
1996, Circumventing genetic restriction of protection against malaria with multigene DNA
immunization: CD8+ T cell-, interferon γ-, and nitric oxide-dependent immunity, J. Exper.
Med. 183:1739-1746).

Many methods may be used to introduce the vaccine formulations described above.
These include, but are not limited to, oral, intradermal, intramuscular, intraperitoneal,
intravenous, subcutaneous, and intranasal routes. Alternatively, it may be preferable to
introduce the chimeric virus vaccine formulation via the natural route of infection of the
pathogen for which the vaccine is designed. The DNA vaccines of the present invention
may be administered in saline solutions by injections into muscle or skin using a syringe
and needle (Wolff, J.A. et al., 1990, Direct gene transfer into mouse muscle in vivo, Science
247:1465-1468; Raz, E., 1994, Intradermal gene immunization: The possible role of DNA
uptake in the induction of cellular immunity to viruses, Proc. Natl. Acad. Sci. USA
91:9519-9523). Another way to administer DNA vaccines is the "gene gun" method, whereby
microscopic gold beads coated with the DNA molecules of interest are fired into the cells
(Tang, D. et al., 1992, Genetic immunization is a simple method for eliciting an immune
response, Nature 356:152-154). For general reviews of the methods for DNA vaccines, see
Robinson, H.L., 1999, DNA vaccines: basic mechanism and immune responses (Review),
Int. J. Mol. Med. 4(5):549-555; Barber, B., 1997, Introduction: Emerging vaccine strategies,
Seminars in Immunology 9(5):269-270; and Robinson, H.L. et al., 1997, DNA vaccines,
Seminars in Immunology 9(5):271-283.

5.3 Attenuation of hSARS Virus or Variants Thereof

The hSARS virus or variants thereof of the invention can be genetically engineered 
to exhibit an attenuated phenotype. In particular, the viruses of the invention exhibit an 
attenuated phenotype in a subject to which the virus is administered as a vaccine. 
Attenuation can be achieved by any method known to a skilled artisan. Without being
bound by theory, the attenuated phenotype of the viruses of the invention can be caused, e.g.,
by using a virus that naturally does not replicate well in an intended host species, for 
example, by reduced replication of the viral genome, by reduced ability of the virus to infect 
a host cell, or by reduced ability of the viral proteins to assemble to an infectious viral 
particle relative to the wild type strain of the virus. 

The attenuated phenotypes of hSARS virus or variants thereof can be tested by any
method known to the artisan. A candidate virus can, for example, be tested for its ability to
infect a host or for the rate of replication in a cell culture system. In certain embodiments,
growth curves at different temperatures are used to test the attenuated phenotype of the
virus. For example, an attenuated virus is able to grow at 35°C, but not at 39°C or 40°C. In
certain embodiments, different cell lines can be used to evaluate the attenuated phenotype of
the virus. For example, an attenuated virus may only be able to grow in monkey cell lines
but not the human cell lines, or the achievable virus titers in different cell lines are different
for the attenuated virus. In certain embodiments, viral replication in the respiratory tract of
a small animal model, including but not limited to, hamsters, cotton rats, mice and guinea
pigs, is used to evaluate the attenuated phenotypes of the virus. In other embodiments, the
immune response induced by the virus, including but not limited to, the antibody titers (e.g.,
assayed by plaque reduction neutralization assay or ELISA) is used to evaluate the
attenuated phenotypes of the virus. In a specific embodiment, the plaque reduction
neutralization assay or ELISA is carried out at a low dose. In certain embodiments, the
ability of the hSARS virus to elicit pathological symptoms in an animal model can be tested.
A reduced ability of the virus to elicit pathological symptoms in an animal model system is
indicative of its attenuated phenotype. In a specific embodiment, the candidate viruses are
tested in a monkey model for nasal infection, indicated by mucus production.

The viruses of the invention can be attenuated such that one or more of the 
functional characteristics of the virus are impaired. In certain embodiments, attenuation is 
measured in comparison to the wild type strain of the virus from which the attenuated virus
is derived. In other embodiments, attenuation is determined by comparing the growth of an
attenuated virus in different host systems. Thus, as a non-limiting example, hSARS virus
or a variant thereof is said to be attenuated when grown in a human host if the growth of the
hSARS or variant thereof in the human host is reduced compared to the non-attenuated
hSARS or variant thereof.

In certain embodiments, the attenuated virus of the invention is capable of infecting
a host and is capable of replicating in the host such that infectious viral particles are
produced. In comparison to the wild type strain, however, the attenuated strain grows to
lower titers or grows more slowly. Any technique known to the skilled artisan can be used
to determine the growth curve of the attenuated virus and compare it to the growth curve of
the wild type virus.
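
By way of non-limiting illustration, the comparison of growth curves described above can be sketched as follows. This is a minimal sketch and not a method prescribed by the invention: the function names, the sampling times and all titer values are hypothetical and serve only to show how "lower titers" and "grows more slowly" might be quantified from plaque-assay readings.

```python
import math

def peak_log10_titer(titers):
    """Return the highest titer of a growth curve on a log10 scale."""
    return max(math.log10(t) for t in titers)

def max_growth_rate(hours, titers):
    """Return the steepest log10 titer increase per hour between consecutive samples."""
    rates = []
    for i in range(1, len(hours)):
        delta_log = math.log10(titers[i]) - math.log10(titers[i - 1])
        rates.append(delta_log / (hours[i] - hours[i - 1]))
    return max(rates)

def compare_growth(hours, wild_type, candidate):
    """Summarize how a candidate virus grows relative to the wild type strain."""
    return {
        # Positive value: the candidate reaches a lower peak titer.
        "peak_reduction_log10": peak_log10_titer(wild_type) - peak_log10_titer(candidate),
        # True when the candidate's fastest growth phase is slower than the wild type's.
        "grows_more_slowly": max_growth_rate(hours, candidate) < max_growth_rate(hours, wild_type),
    }

# Hypothetical titers (PFU/ml) sampled at the same time points for both viruses.
hours = [0, 12, 24, 48, 72]
wild_type = [1e2, 1e4, 1e6, 1e7, 1e7]
candidate = [1e2, 1e3, 1e4, 1e5, 1e5]

result = compare_growth(hours, wild_type, candidate)
print(result)
```

In this hypothetical data set the candidate peaks about two log10 units below the wild type and its fastest growth phase is slower, which is the pattern the paragraph above associates with an attenuated strain. Any threshold for calling a virus attenuated would have to be chosen by the artisan.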

In certain embodiments, the attenuated virus of the invention (e.g., a recombinant or 
chimeric hSARS) cannot replicate in human cells as well as the wild type virus (e.g., wild 
type hSARS) does. However, the attenuated virus can replicate well in a cell line that lacks
interferon functions, such as Vero cells.

In other embodiments, the attenuated virus of the invention is capable of infecting a 
host, of replicating in the host, and of causing proteins of the virus of the invention to be 
inserted into the cytoplasmic membrane, but the attenuated virus does not cause the host to 
produce new infectious viral particles. In certain embodiments, the attenuated virus infects
the host, replicates in the host, and causes viral proteins to be inserted in the cytoplasmic
membrane of the host with the same efficiency as the wild type hSARS. In other
embodiments, the ability of the attenuated virus to cause viral proteins to be inserted into
the cytoplasmic membrane of the host cell is reduced compared to the wild type virus. In
certain embodiments, the ability of the attenuated hSARS virus to replicate in the host is
reduced compared to the wild type virus. Any technique known to the skilled artisan can be
used to determine whether a virus is capable of infecting a mammalian cell, of replicating
within the host, and of causing viral proteins to be inserted into the cytoplasmic membrane
of the host.

In certain embodiments, the attenuated virus of the invention is capable of infecting
a host. In contrast to the wild type hSARS, however, the attenuated hSARS cannot be
replicated in the host. In a specific embodiment, the attenuated hSARS virus can infect a
host and can cause the host to insert viral proteins in its cytoplasmic membranes, but the
attenuated virus is incapable of being replicated in the host. Any method known to the
skilled artisan can be used to test whether the attenuated hSARS has infected the host and
has caused the host to insert viral proteins in its cytoplasmic membranes.

In certain embodiments, the ability of the attenuated virus to infect a host is reduced 
compared to the ability of the wild type virus to infect the same host. Any technique known
to the skilled artisan can be used to determine whether a virus is capable of infecting a host.

In certain embodiments, mutations (e.g., missense mutations) are introduced into the
genome of the virus, for example, into the sequence of SEQ ID NO:1, 11, 13, 15, 2471 or
2473, to generate a virus with an attenuated phenotype. Mutations (e.g., missense
mutations) can be introduced into the structural genes and/or regulatory genes of the hSARS.
Mutations can be additions, substitutions, deletions, or combinations thereof. Such variants
of hSARS can be screened for a predicted functionality, such as infectivity, replication
ability, protein synthesis ability, assembling ability, as well as cytopathic effect in cell
cultures. In a specific embodiment, the missense mutation is a cold-sensitive mutation. In
another embodiment, the missense mutation is a heat-sensitive mutation. In another
embodiment, the missense mutation prevents a normal processing or cleavage of the viral
proteins.
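
By way of non-limiting illustration, the following sketch shows how a single nucleotide substitution in a coding sequence might be classified as silent, missense or nonsense using the standard genetic code. The short coding sequence, the function name and the positions used are hypothetical and are not taken from any SEQ ID NO of the invention; the sequence is written in DNA letters, and positions are counted from zero.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_substitution(coding_seq, position, new_base):
    """Classify a single-base substitution within an in-frame coding sequence.

    Returns a tuple (kind, old_amino_acid, new_amino_acid), where kind is
    "silent", "missense" or "nonsense".
    """
    codon_start = position - (position % 3)
    old_codon = coding_seq[codon_start:codon_start + 3]
    offset = position - codon_start
    new_codon = old_codon[:offset] + new_base + old_codon[offset + 1:]
    old_aa = CODON_TABLE[old_codon]
    new_aa = CODON_TABLE[new_codon]
    if new_aa == old_aa:
        kind = "silent"
    elif new_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return kind, old_aa, new_aa

# Hypothetical coding sequence: ATG GCT GAA encodes Met-Ala-Glu.
example = "ATGGCTGAA"
print(classify_substitution(example, 4, "T"))  # GCT -> GTT
print(classify_substitution(example, 5, "C"))  # GCT -> GCC
print(classify_substitution(example, 6, "T"))  # GAA -> TAA
```

Such a classification only identifies the predicted amino acid change. Whether a given missense mutation in fact confers a cold-sensitive, heat-sensitive or processing-defective phenotype would still have to be determined experimentally, as described above.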

In other embodiments, deletions are introduced into the genome of the hSARS virus, 
which result in the attenuation of the virus. 

In certain embodiments, attenuation of the virus is achieved by replacing a gene of 
the wild type virus with a gene of a virus of a different species, of a different subgroup, or
of a different variant. In another aspect, attenuation of the virus is achieved by replacing
one or more specific domains of a protein of the wild type virus with domains derived from
the corresponding protein of a virus of a different species. In certain other embodiments,
attenuation of the virus is achieved by deleting one or more specific domains of a protein of
the wild type virus.

When a live attenuated vaccine is used, its safety must also be considered. The 
vaccine must not cause disease. Any techniques known in the art that can make a vaccine 
safe may be used in the present invention. In addition to attenuation techniques, other 
techniques may be used. One non-limiting example is to use a soluble heterologous gene
that cannot be incorporated into the virion membrane. For example, a single copy of the 
soluble version of a viral transmembrane protein lacking the transmembrane and cytosolic 
domains thereof can be used. 

Various assays can be used to test the safety of a vaccine. For example, sucrose
gradients and neutralization assays can be used to test the safety. A sucrose gradient assay
can be used to determine whether a heterologous protein is inserted in a virion. If the
heterologous protein is inserted in the virion, the virion should be tested for its ability to
cause symptoms in an appropriate animal model since the virus may have acquired new,
possibly pathological, properties.

5.4 Adjuvants and Carrier Molecules 

hSARS-associated antigens are administered with one or more adjuvants. In one 
embodiment, the hSARS-associated antigen is administered together with a mineral salt
adjuvant or mineral salt gel adjuvant. Such mineral salt and mineral salt gel adjuvants
include, but are not limited to, aluminum hydroxide (ALHYDROGEL, REHYDRAGEL), 
aluminum phosphate gel, aluminum hydroxyphosphate (ADJU-PHOS), and calcium 
phosphate. 

In another embodiment, hSARS-associated antigen is administered with an
immunostimulatory adjuvant. This class of adjuvants includes, but is not limited to,
cytokines (e.g., interleukin-2, interleukin-7, interleukin-12, granulocyte-macrophage colony
stimulating factor (GM-CSF), interferon-γ, interleukin-1β (IL-1β), and IL-1β peptide or
Sclavo Peptide), cytokine-containing liposomes, triterpenoid glycosides or saponins (e.g.,
QuilA and QS-21, also sold under the trademark STIMULON, ISCOPREP), Muramyl
Dipeptide (MDP) derivatives, such as N-acetyl-muramyl-L-threonyl-D-isoglutamine
(Threonyl-MDP, sold under the trademark TERMURTIDE), GMDP,
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine,
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine,
muramyl tripeptide phosphatidylethanolamine (MTP-PE), unmethylated CpG dinucleotides and
oligonucleotides, such as bacterial DNA and fragments thereof, LPS, monophosphoryl
Lipid A (3D-MLA sold under the trademark MPL), and polyphosphazenes.

In another embodiment, the adjuvant used is a particulate adjuvant, including, but not
limited to, emulsions, e.g., Freund's Complete Adjuvant, Freund's Incomplete Adjuvant,
squalene or squalane oil-in-water adjuvant formulations, such as SAF and MF59, e.g.,
prepared with block-copolymers, such as L-121 (polyoxypropylene/polyoxyethylene) sold
under the trademark PLURONIC L-121, liposomes, virosomes, cochleates, and immune
stimulating complex, which is sold under the trademark ISCOM.

In another embodiment, a microparticulate adjuvant is used. Microparticulate
adjuvants include, but are not limited to, biodegradable and biocompatible polyesters,
homo- and copolymers of lactic acid (PLA) and glycolic acid (PGA), poly(lactide-co-glycolides)
(PLGA) microparticles, polymers that self-associate into particulates (poloxamer particles),
soluble polymers (polyphosphazenes), and virus-like particles (VLPs) such as recombinant
protein particulates, e.g., hepatitis B surface antigen (HBsAg).

Yet another class of adjuvants that may be used includes mucosal adjuvants,
including but not limited to heat-labile enterotoxin from Escherichia coli (LT), cholera
holotoxin (CT) and cholera Toxin B Subunit (CTB) from Vibrio cholerae, mutant toxins
(e.g., LTK63 and LTR72), microparticles, and polymerized liposomes.

In other embodiments, any of the above classes of adjuvants may be used in
combination with each other or with other adjuvants. Non-limiting examples
of combination adjuvant preparations that can be used to administer the hSARS-associated
antigens of the invention include liposomes containing immunostimulatory protein,
cytokines, or T-cell and/or B-cell peptides, or microbes with or without entrapped IL-2 or
microparticles containing enterotoxin. Other adjuvants known in the art are also included
within the scope of the invention (see Vaccine Design: The Subunit and Adjuvant Approach,
Chap. 7, Michael F. Powell and Mark J. Newman (eds.), Plenum Press, New York, 1995,
which is incorporated herein in its entirety).

The effectiveness of an adjuvant may be determined by measuring the induction of 
antibodies directed against an immunogenic polypeptide containing a hSARS polypeptide 
epitope, the antibodies resulting from administration of this polypeptide in vaccines which 
are also comprised of the various adjuvants. 
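
By way of non-limiting illustration, the induction of antibodies in groups receiving different adjuvants might be compared using geometric mean titers, as in the following sketch. The group names and the reciprocal endpoint titers are hypothetical, and the use of a geometric mean is an assumption made here for illustration rather than a measure prescribed by the invention.

```python
import math

def geometric_mean_titer(titers):
    """Return the geometric mean of reciprocal endpoint antibody titers."""
    logs = [math.log2(t) for t in titers]
    return 2 ** (sum(logs) / len(logs))

def fold_increase(adjuvanted, baseline):
    """Return the ratio of geometric mean titers, adjuvanted group over baseline group."""
    return geometric_mean_titer(adjuvanted) / geometric_mean_titer(baseline)

# Hypothetical reciprocal ELISA endpoint titers for individual immunized animals.
groups = {
    "antigen alone": [100, 200, 400],
    "antigen + adjuvant A": [800, 1600, 3200],
}

baseline = groups["antigen alone"]
for name, titers in groups.items():
    gmt = geometric_mean_titer(titers)
    ratio = fold_increase(titers, baseline)
    print(f"{name}: GMT = {gmt:.0f}, fold over antigen alone = {ratio:.1f}")
```

Titers are averaged on a logarithmic scale because serial-dilution readouts are typically spaced by constant factors. A larger fold increase over the antigen-alone group would suggest, under these assumptions, a more effective adjuvant.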

The polypeptides may be formulated into the vaccine as neutral or salt forms.
Pharmaceutically acceptable salts include the acid addition salts (formed with free amino
groups of the peptide), which are formed with inorganic acids, such as, for example,
hydrochloric or phosphoric acids, or organic acids such as acetic, oxalic, tartaric, maleic,
and the like. Salts formed with free carboxyl groups may also be derived from inorganic
bases, such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides,
and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine,
procaine and the like.

The vaccines of the invention may be multivalent or univalent. Multivalent vaccines
are made from recombinant viruses that direct the expression of more than one antigen.

Many methods may be used to introduce the vaccine formulations of the invention; 
these include but are not limited to oral, intradermal, intramuscular, intraperitoneal, 
intravenous, subcutaneous, intranasal routes, and via scarification (scratching through the 
top layers of skin, e.g., using a bifurcated needle).

The patient to which the vaccine is administered is preferably a mammal, most 
preferably a human, but can also be a non-human animal including but not limited to cows, 
horses, sheep, pigs, fowl (e.g., chickens), goats, cats, dogs, hamsters, mice and rats. 

5.5 Preparation of Antibodies

Antibodies which specifically recognize a polypeptide of the invention, such as, but
not limited to, polypeptides comprising the sequence of SEQ ID NO:2, 12, 14, 2472, 2474,
and polypeptides as shown in Figures 11 (SEQ ID NOS:17-239, 241-736 and 738-1107)
and 12 (SEQ ID NOS:1109-1589, 1591-1964 and 1966-2470), or hSARS epitope or
antigen-binding fragments thereof can be used for detecting, screening, and isolating the
polypeptide of the invention or fragments thereof, or similar sequences that might encode
similar enzymes from the other organisms. For example, in one specific embodiment, an
antibody which immunospecifically binds hSARS epitope, or a fragment thereof, can be
used for various in vitro detection assays, including enzyme-linked immunosorbent assays
(ELISA), radioimmunoassays, Western blot, etc., for the detection of a polypeptide of the
invention or, preferably, hSARS, in samples, for example, a biological material, including
cells, cell culture media (e.g., bacterial cell culture media, mammalian cell culture media,
insect cell culture media, yeast cell culture media, etc.), blood, plasma, serum, tissues,
sputum, nasopharyngeal aspirates, etc.

Antibodies specific for a polypeptide of the invention or any epitope of hSARS may
be generated by any suitable method known in the art. Polyclonal antibodies to an
antigen-of-interest, for example, the hSARS virus from deposit no. CCTCC-V200303, or a
virus that comprises a nucleotide sequence of SEQ ID NO:15, can be produced by various
procedures well known in the art. For example, an antigen can be administered to various
host animals including, but not limited to, rabbits, mice, rats, etc., to induce the production
of antisera containing polyclonal antibodies specific for the antigen. Various adjuvants may
be used to increase the immunological response, depending on the host species, and include
but are not limited to, Freund's (complete and incomplete) adjuvant, mineral gels such as
aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols,
polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol, and
potentially useful adjuvants for humans such as BCG (Bacille Calmette-Guerin) and
Corynebacterium parvum. Such adjuvants are also well known in the art.

Monoclonal antibodies can be prepared using a wide variety of techniques known in 
the art including the use of hybridoma, recombinant, and phage display technologies, or a 
combination thereof. For example, monoclonal antibodies can be produced using 
hybridoma techniques including those known in the art and taught, for example, in Harlow
et al., Antibodies: A Laboratory Manual, (Cold Spring Harbor Laboratory Press, 2nd ed.
1988); Hammerling, et al., in: Monoclonal Antibodies and T-Cell Hybridomas, pp. 563-681
(Elsevier, N.Y., 1981) (both of which are incorporated by reference in their entireties). The
term "monoclonal antibody" as used herein is not limited to antibodies produced through
hybridoma technology. The term "monoclonal antibody" refers to an antibody that is
derived from a single clone, including any eukaryotic, prokaryotic, or phage clone, and not
the method by which it is produced.

Methods for producing and screening for specific antibodies using hybridoma 
technology are routine and well known in the art. In a non-limiting example, mice can be 
immunized with an antigen of interest or a cell expressing such an antigen. Once an
immune response is detected, e.g., antibodies specific for the antigen are detected in the
mouse serum, the mouse spleen is harvested and splenocytes isolated. The splenocytes are
then fused by well known techniques to any suitable myeloma cells. Hybridomas are
selected and cloned by limiting dilution. The hybridoma clones are then assayed by
methods known in the art for cells that secrete antibodies capable of binding the antigen.
Ascites fluid, which generally contains high levels of antibodies, can be generated by
inoculating mice intraperitoneally with positive hybridoma clones.

Antibody fragments which recognize specific epitopes may be generated by known 
techniques. For example, Fab and F(ab')2 fragments may be produced by proteolytic
cleavage of immunoglobulin molecules, using enzymes such as papain (to produce Fab
fragments) or pepsin (to produce F(ab')2 fragments). F(ab')2 fragments contain the
complete light chain, and the variable region, the CH1 region and the hinge region of the
heavy chain.

The antibodies of the invention or fragments thereof can also be produced by any
method known in the art for the synthesis of antibodies, in particular, by chemical synthesis
or preferably, by recombinant expression techniques.

The nucleotide sequence encoding an antibody may be obtained from any 
information available to those skilled in the art (i.e., from GenBank, the literature, or by
routine cloning and sequence analysis). If a clone containing a nucleic acid encoding a
particular antibody or an epitope-binding fragment thereof is not available, but the sequence
of the antibody molecule or epitope-binding fragment thereof is known, a nucleic acid
encoding the immunoglobulin may be chemically synthesized or obtained from a suitable
source (e.g., an antibody cDNA library, or a cDNA library generated from, or nucleic acid,
preferably poly A+ RNA, isolated from any tissue or cells expressing the antibody, such as
hybridoma cells selected to express an antibody) by PCR amplification using synthetic
primers hybridizable to the 3' and 5' ends of the sequence or by cloning using an
oligonucleotide probe specific for the particular gene sequence to identify, e.g., a cDNA
clone from a cDNA library that encodes the antibody. Amplified nucleic acids generated by
PCR may then be cloned into replicable cloning vectors using any method well known in
the art.

Once the nucleotide sequence of the antibody is determined, the nucleotide sequence 
of the antibody may be manipulated using methods well known in the art for the 
manipulation of nucleotide sequences, e.g., recombinant DNA techniques, site directed
mutagenesis, PCR, etc. (see, for example, the techniques described in Sambrook et al., supra;
and Ausubel et al., eds., 1998, Current Protocols in Molecular Biology, John Wiley & Sons,
NY, which are both incorporated by reference herein in their entireties), to generate
antibodies having a different amino acid sequence by, for example, introducing amino acid
substitutions, deletions, and/or insertions into the epitope-binding domain regions of the
antibodies or any portion of antibodies which may enhance or reduce biological activities of
the antibodies.

Recombinant expression of an antibody requires construction of an expression 
vector containing a nucleotide sequence that encodes the antibody. Once a nucleotide 
sequence encoding an antibody molecule or a heavy or light chain of an antibody, or portion
thereof has been obtained, the vector for the production of the antibody molecule may be
produced by recombinant DNA technology using techniques well known in the art as
discussed in the previous sections. Methods which are well known to those skilled in the art
can be used to construct expression vectors containing antibody coding sequences and
appropriate transcriptional and translational control signals. These methods include, for
example, in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic
recombination. The nucleotide sequence encoding the heavy-chain variable region, the
light-chain variable region, both the heavy-chain and light-chain variable regions, an
epitope-binding fragment of the heavy- and/or light-chain variable region, or one or more
complementarity determining regions (CDRs) of an antibody may be cloned into such a
vector for expression. The expression vector thus prepared can then be introduced into
appropriate host cells for the expression of the antibody. Accordingly, the invention
includes host cells containing a polynucleotide encoding an antibody specific for the
polypeptides of the invention or fragments thereof.

The host cell may be co-transfected with two expression vectors of the invention, the 
first vector encoding a heavy chain derived polypeptide and the second vector encoding a 
light chain derived polypeptide. The two vectors may contain identical selectable markers
which enable equal expression of heavy and light chain polypeptides or different selectable
markers to ensure maintenance of both plasmids. Alternatively, a single vector may be used
which encodes, and is capable of expressing, both heavy and light chain polypeptides. In
such situations, the light chain should be placed before the heavy chain to avoid an excess
of toxic free heavy chain (Proudfoot, Nature, 322:52, 1986; and Kohler, Proc. Natl. Acad.
Sci. USA, 77:2197, 1980). The coding sequences for the heavy and light chains may
comprise cDNA or genomic DNA.

In another embodiment, antibodies can also be generated using various phage 
display methods known in the art. In phage display methods, functional antibody domains 
are displayed on the surface of phage particles which carry the polynucleotide sequences 
encoding them. In a particular embodiment, such phage can be utilized to display antigen
binding domains, such as Fab and Fv or disulfide-bond stabilized Fv, expressed from a
repertoire or combinatorial antibody library (e.g., human or murine). Phage expressing an
antigen binding domain that binds the antigen of interest can be selected or identified with
antigen, e.g., using labeled antigen or antigen bound or captured to a solid surface or bead.
Phage used in these methods are typically filamentous phage, including fd and M13. The
antigen binding domains are expressed as a recombinantly fused protein to either the phage
gene III or gene VIII protein. Examples of phage display methods that can be used to make
the immunoglobulins, or fragments thereof, of the present invention include those disclosed
in Brinkman et al., J. Immunol. Methods, 182:41-50, 1995; Ames et al., J. Immunol.
Methods, 184:177-186, 1995; Kettleborough et al., Eur. J. Immunol., 24:952-958, 1994;
Persic et al., Gene, 187:9-18, 1997; Burton et al., Advances in Immunology, 57:191-280,
1994; PCT application No. PCT/GB91/01134; PCT publications WO 90/02809; WO
91/10737; WO 92/01047; WO 92/18619; WO 93/11236; WO 95/15982; WO 95/20401; and
U.S. Patent Nos. 5,698,426; 5,223,409; 5,403,484; 5,580,717; 5,427,908; 5,750,753;
5,821,047; 5,571,698; 5,427,908; 5,516,637; 5,780,225; 5,658,727; 5,733,743 and
5,969,108; each of which is incorporated herein by reference in its entirety.

As described in the above references, after phage selection, the antibody coding
regions from the phage can be isolated and used to generate whole antibodies, including
human antibodies, or any other desired fragments, and expressed in any desired host,
including mammalian cells, insect cells, plant cells, yeast, and bacteria, e.g., as described in
detail below. For example, techniques to recombinantly produce Fab, Fab' and F(ab')2
fragments can also be employed using methods known in the art such as those disclosed in
PCT publication WO 92/22324; Mullinax et al., BioTechniques, 12(6):864-869, 1992;
Sawai et al., AJRI, 34:26-34, 1995; and Better et al., Science, 240:1041-1043, 1988 (each of
which is incorporated by reference in its entirety). Examples of techniques which can be
used to produce single-chain Fvs and antibodies include those described in U.S. Patent Nos.
4,946,778 and 5,258,498; Huston et al., Methods in Enzymology, 203:46-88, 1991; Shu et
al., PNAS, 90:7995-7999, 1993; and Skerra et al., Science, 240:1038-1040, 1988.

Once an antibody molecule of the invention has been produced by any of the methods
described above, it may then be purified by any method known in the art for purification of
an immunoglobulin molecule, for example, by chromatography (e.g., ion exchange, affinity,
particularly by affinity for the specific antigen after Protein A or Protein G purification, and
sizing column chromatography), centrifugation, differential solubility, or by any other
standard techniques for the purification of proteins. Further, the antibodies of the present
invention or fragments thereof may be fused to heterologous polypeptide sequences
described herein or otherwise known in the art to facilitate purification.






For some uses, including in vivo use of antibodies in humans and in vitro detection 
assays, it may be preferable to use chimeric, humanized, or human antibodies. A chimeric 
antibody is a molecule in which different portions of the antibody are derived from different 
animal species, such as antibodies having a variable region derived from a murine 
monoclonal antibody and a constant region derived from a human immunoglobulin.
Methods for producing chimeric antibodies are known in the art. See, e.g., Morrison,
Science, 229:1202, 1985; Oi et al., BioTechniques, 4:214, 1986; Gillies et al., J. Immunol.
Methods, 125:191-202, 1989; U.S. Patent Nos. 5,807,715; 4,816,567; and 4,816,397, which
are incorporated herein by reference in their entireties. Humanized antibodies are antibody
molecules from non-human species that bind the desired antigen and have one or more
complementarity determining regions (CDRs) from the non-human species and framework
regions from a human immunoglobulin molecule. Often, framework residues in the human
framework regions will be substituted with the corresponding residue from the CDR donor
antibody to alter, preferably improve, antigen binding. These framework substitutions are
identified by methods well known in the art, e.g., by modeling of the interactions of the
CDR and framework residues to identify framework residues important for antigen binding
and sequence comparison to identify unusual framework residues at particular positions.
See, e.g., Queen et al., U.S. Patent No. 5,585,089; Riechmann et al., Nature, 332:323, 1988,
which are incorporated herein by reference in their entireties. Antibodies can be humanized
using a variety of techniques known in the art including, for example, CDR-grafting (EP
239,400; PCT publication WO 91/09967; U.S. Patent Nos. 5,225,539; 5,530,101 and
5,585,089), veneering or resurfacing (EP 592,106; EP 519,596; Padlan, Molecular
Immunology, 28(4/5):489-498, 1991; Studnicka et al., Protein Engineering, 7(6):805-814,
1994; Roguska et al., Proc. Natl. Acad. Sci. USA 91:969-973, 1994), and chain shuffling
(U.S. Patent No. 5,565,332), all of which are hereby incorporated by reference in their
entireties.

Completely human antibodies are particularly desirable for therapeutic treatment of 
human patients. Human antibodies can be made by a variety of methods known in the art 
including phage display methods described above using antibody libraries derived from 
human immunoglobulin sequences. See U.S. Patent Nos. 4,444,887 and 4,716,111; and
PCT publications WO 98/46645; WO 98/50433; WO 98/24893; WO 98/16654; WO 
96/34096; WO 96/33735; and WO 91/10741, each of which is incorporated herein by 
reference in its entirety. 

Human antibodies can also be produced using transgenic mice which are incapable
of expressing functional endogenous immunoglobulins, but which can express human
immunoglobulin genes. For an overview of this technology for producing human
antibodies, see Lonberg and Huszar, Int. Rev. Immunol., 13:65-93, 1995. For a detailed
discussion of this technology for producing human antibodies and human monoclonal
antibodies and protocols for producing such antibodies, see, e.g., PCT publications WO
98/24893; WO 92/01047; WO 96/34096; WO 96/33735; European Patent No. 0 598 877;
U.S. Patent Nos. 5,413,923; 5,625,126; 5,633,425; 5,569,825; 5,661,016; 5,545,806;
5,814,318; 5,885,793; 5,916,771; and 5,939,598, which are incorporated by reference herein
in their entireties. In addition, companies such as Abgenix, Inc. (Fremont, CA), Medarex
(NJ) and Genpharm (San Jose, CA) can be engaged to provide human antibodies directed
against a selected antigen using technology similar to that described above.

Completely human antibodies which recognize a selected epitope can be generated 
using a technique referred to as "guided selection." In this approach a selected non-human
monoclonal antibody, e.g., a mouse antibody, is used to guide the selection of a completely
human antibody recognizing the same epitope (Jespers et al., Bio/technology, 12:899-903,
1988).

Antibodies fused or conjugated to heterologous polypeptides may be used in in vitro 
immunoassays and in purification methods (e.g., affinity chromatography) well known in
the art. See, e.g., PCT publication Number WO 93/21232; EP 439,095; Naramura et al.,
Immunol. Lett., 39:91-99, 1994; U.S. Patent 5,474,981; Gillies et al., PNAS, 89:1428-1432,
1992; and Fell et al., J. Immunol., 146:2446-2452, 1991, which are incorporated herein by 
reference in their entireties. 

Antibodies may also be attached to solid supports, which are particularly useful for
immunoassays or purification of the polypeptides of the invention or fragments, derivatives,
analogs, or variants thereof, or similar molecules having similar enzymatic activities to
the polypeptide of the invention. Such solid supports include, but are not limited to, glass,
cellulose, polyacrylamide, nylon, polystyrene, polyvinyl chloride or polypropylene.



5.6 Pharmaceutical Compositions and Kits

The present invention encompasses pharmaceutical compositions comprising anti- 
viral agents of the present invention. In a specific embodiment, the anti-viral agent is an
antibody which immunospecifically binds and neutralizes the hSARS virus or variants
thereof, or any proteins derived therefrom. In another specific embodiment, the anti-viral
agent is a polypeptide or nucleic acid molecule of the invention. The pharmaceutical
compositions have utility as an anti-viral prophylactic agent and may be administered to a
subject where the subject has been exposed or is expected to be exposed to a virus.

Various delivery systems are known and can be used to administer the
pharmaceutical composition of the invention, e.g., encapsulation in liposomes,
microparticles, microcapsules, recombinant cells capable of expressing the mutant viruses,
and receptor mediated endocytosis (see, e.g., Wu and Wu, 1987, J. Biol. Chem. 262:4429-4432).
Methods of introduction include but are not limited to intradermal, intramuscular,
intraperitoneal, intravenous, subcutaneous, intranasal, epidural, and oral routes. The
compounds may be administered by any convenient route, for example by infusion or bolus
injection, by absorption through epithelial or mucocutaneous linings (e.g., oral mucosa,
rectal and intestinal mucosa, etc.) and may be administered together with other biologically
active agents. Administration can be systemic or local. In a preferred embodiment, it may
be desirable to introduce the pharmaceutical compositions of the invention into the lungs by
any suitable route. Pulmonary administration can also be employed, e.g., by use of an
inhaler or nebulizer, and formulation with an aerosolizing agent.

In a specific embodiment, it may be desirable to administer the pharmaceutical
compositions of the invention locally to the area in need of treatment; this may be achieved
by, for example, and not by way of limitation, local infusion during surgery, topical
application, e.g., in conjunction with a wound dressing after surgery, by injection, by means
of a catheter, by means of a suppository, or by means of an implant, said implant being of a
porous, non-porous, or gelatinous material, including membranes, such as sialastic
membranes, or fibers. In one embodiment, administration can be by direct injection at the
site (or former site) of infected tissues.

In another embodiment, the pharmaceutical composition can be delivered in a
vesicle, in particular a liposome (see Langer, 1990, Science 249:1527-1533; Treat et al., in
Liposomes in the Therapy of Infectious Disease and Cancer, Lopez-Berestein and Fidler
(eds.), Liss, New York, pp. 353-365 (1989); Lopez-Berestein, ibid., pp. 317-327; see
generally ibid.).

In yet another embodiment, the pharmaceutical composition can be delivered in a
controlled release system. In one embodiment, a pump may be used (see Langer, supra;
Sefton, 1987, CRC Crit. Ref. Biomed. Eng. 14:201; Buchwald et al., 1980, Surgery 88:507;
and Saudek et al., 1989, N. Engl. J. Med. 321:574). In another embodiment, polymeric
materials can be used (see Medical Applications of Controlled Release, Langer and Wise
(eds.), CRC Pres., Boca Raton, Florida (1974); Controlled Drug Bioavailability, Drug
Product Design and Performance, Smolen and Ball (eds.), Wiley, New York (1984); Ranger
and Peppas, J. Macromol. Sci. Rev. Macromol. Chem. 23:61 (1983); see also Levy et al.,
1985, Science 228:190; During et al., 1989, Ann Neurol. 25:351; Howard et al., 1989, J.
Neurosurg. 71:105). In yet another embodiment, a controlled release system can be placed
in proximity of the composition's target, i.e., the lung, thus requiring only a fraction of the
systemic dose (see, e.g., Goodson, in Medical Applications of Controlled Release, supra,
vol. 2, pp. 115-138 (1984)).

Other controlled release systems are discussed in the review by Langer (Science 
249:1527-1533 (1990)). 

The pharmaceutical compositions of the present invention comprise a
therapeutically effective amount of a live attenuated, inactivated or killed hSARS virus, or
recombinant or chimeric hSARS virus, and a pharmaceutically acceptable carrier. In a
specific embodiment, the term "pharmaceutically acceptable" means approved by a
regulatory agency of the Federal or a state government or listed in the U.S. Pharmacopeia or
other generally recognized pharmacopeia for use in animals, and more particularly in
humans. The term "carrier" refers to a diluent, adjuvant, excipient, or vehicle with which
the pharmaceutical composition is administered. Such pharmaceutical carriers can be sterile
liquids, such as water and oils, including those of petroleum, animal, vegetable or synthetic
origin, such as peanut oil, soybean oil, mineral oil, sesame oil and the like. Water is a
preferred carrier when the pharmaceutical composition is administered intravenously.
Saline solutions and aqueous dextrose and glycerol solutions can also be employed as liquid
carriers, particularly for injectable solutions. Suitable pharmaceutical excipients include
starch, glucose, lactose, sucrose, gelatin, malt, rice, flour, chalk, silica gel, sodium stearate,
glycerol monostearate, talc, sodium chloride, dried skim milk, glycerol, propylene glycol,
water, ethanol and the like. The composition, if desired, can also contain minor amounts of
wetting or emulsifying agents, or pH buffering agents. These compositions can take the
form of solutions, suspensions, emulsions, tablets, pills, capsules, powders, sustained release
formulations and the like. The composition can be formulated as a suppository, with
traditional binders and carriers such as triglycerides. Oral formulation can include standard
carriers such as pharmaceutical grades of mannitol, lactose, starch, magnesium stearate,
sodium saccharine, cellulose, magnesium carbonate, etc. Examples of suitable
pharmaceutical carriers are described in "Remington's Pharmaceutical Sciences" by E.W.
Martin. The formulation should suit the mode of administration.
In a preferred embodiment, the composition is formulated in accordance with
routine procedures as a pharmaceutical composition adapted for intravenous administration
to human beings. Typically, compositions for intravenous administration are solutions in
sterile isotonic aqueous buffer. Where necessary, the composition may also include a
solubilizing agent and a local anesthetic such as lignocaine to ease pain at the site of the
injection. Generally, the ingredients are supplied either separately or mixed together in unit
dosage form, for example, as a dry lyophilized powder or water-free concentrate in a
hermetically sealed container such as an ampoule or sachette indicating the quantity of
active agent. Where the composition is to be administered by infusion, it can be dispensed
with an infusion bottle containing sterile pharmaceutical grade water or saline. Where the
composition is administered by injection, an ampoule of sterile water for injection or saline
can be provided so that the ingredients may be mixed prior to administration.

The pharmaceutical compositions of the invention can be formulated as neutral or
salt forms. Pharmaceutically acceptable salts include those formed with free amino groups
such as those derived from hydrochloric, phosphoric, acetic, oxalic, tartaric acids, etc., and
those formed with free carboxyl groups such as those derived from sodium, potassium,
ammonium, calcium, ferric hydroxides, isopropylamine, triethylamine, 2-ethylamino
ethanol, histidine, procaine, etc.

The amount of the pharmaceutical composition of the invention which will be
effective in the treatment of a particular disorder or condition will depend on the nature of
the disorder or condition, and can be determined by standard clinical techniques. In
addition, in vitro assays may optionally be employed to help identify optimal dosage ranges.
The precise dose to be employed in the formulation will also depend on the route of
administration, and the seriousness of the disease or disorder, and should be decided
according to the judgment of the practitioner and each patient's circumstances. However,
suitable dosage ranges for intravenous administration are generally about 20-500
micrograms of active compound per kilogram body weight. Suitable dosage ranges for
intranasal administration are generally about 0.01 pg/kg body weight to 1 mg/kg body
weight. Effective doses may be extrapolated from dose response curves derived from in
vitro or animal model test systems.

Suppositories generally contain active ingredient in the range of 0.5% to 10% by
weight; oral formulations preferably contain 10% to 95% active ingredient.

The invention also provides a pharmaceutical pack or kit comprising one or more
containers filled with one or more of the ingredients of the pharmaceutical compositions of
the invention. Optionally associated with such container(s) can be a notice in the form
prescribed by a governmental agency regulating the manufacture, use or sale of
pharmaceuticals or biological products, which notice reflects approval by the agency of
manufacture, use or sale for human administration. In a preferred embodiment, the kit
contains an anti-viral agent of the invention, e.g., an antibody specific for the polypeptides
encoded by a nucleotide sequence of SEQ ID NO:1, 11, 13, 15, 2471 or 2473, or as shown
in Figures 11 (SEQ ID NOS:17-239, 241-736 and 738-1107) and 12 (SEQ ID NOS:1109-1589,
1591-1964 and 1966-2470), or any hSARS epitope, or a polypeptide or protein of the
present invention, or a nucleic acid molecule of the invention, alone or in combination with
adjuvants, antivirals, antibiotics, analgesics, bronchodilators, or other pharmaceutically
acceptable excipients.

The present invention further encompasses kits comprising a container containing a
pharmaceutical composition of the present invention and instructions for use.


5.7 Detection Assays 

The present invention provides a method for detecting an antibody, which
immunospecifically binds to the hSARS virus, in a biological sample, for example blood,
serum, plasma, saliva, urine, etc., from a patient suffering from SARS. In a specific
embodiment, the method comprises contacting the sample with the hSARS virus, for
example, of deposit no. CCTCC-V200303, or having a genomic nucleic acid sequence of
SEQ ID NO:15, directly immobilized on a substrate and detecting the virus-bound antibody
directly or indirectly by a labeled heterologous anti-isotype antibody. In another specific
embodiment, the sample is contacted with a host cell which is infected by the hSARS virus,
for example, of deposit no. CCTCC-V200303, or having a genomic nucleic acid sequence
of SEQ ID NO:15, and the bound antibody can be detected by immunofluorescent assay as
described in Section 6.5, infra.

An exemplary method for detecting the presence or absence of a polypeptide or
nucleic acid of the invention in a biological sample involves obtaining a biological sample
from various sources and contacting the sample with a compound or an agent capable of
detecting an epitope or nucleic acid (e.g., mRNA, genomic RNA) of the hSARS virus such
that the presence of the hSARS virus is detected in the sample. A preferred agent for
detecting hSARS mRNA or genomic RNA of the invention is a labeled nucleic acid probe
capable of hybridizing to mRNA or genomic RNA encoding a polypeptide of the invention.
The nucleic acid probe can be, for example, a nucleic acid molecule comprising or
consisting of the nucleotide sequence of SEQ ID NO:1, 11, 13, 15, 2471, or 2473, or a
portion thereof, such as an oligonucleotide of at least 15, 20, 25, 30, 50, 100, 250, 500, 750,
1,000 or more contiguous nucleotides in length and sufficient to specifically hybridize
under stringent conditions to a hSARS mRNA or genomic RNA.

In another preferred specific embodiment, the presence of hSARS virus is detected
in the sample by a reverse transcription polymerase chain reaction (RT-PCR) using
primers that are constructed based on a partial nucleotide sequence of the genome of
hSARS virus, for example, that of deposit accession no. CCTCC-V200303, or having a
genomic nucleic acid sequence of SEQ ID NO:15, or based on a nucleotide sequence of
SEQ ID NO:1, 11, 13, 2471 or 2473. In a non-limiting specific embodiment, preferred
primers to be used in an RT-PCR method are: 5'-TACACACCTCAGCGTTG-3' (SEQ ID
NO:3) and 5'-CACGAACGTGACGAAT-3' (SEQ ID NO:4), in the presence of 2.5 mM
MgCl2, and the thermal cycles are, for example, but not limited to, 94 °C for 8 min followed
by 40 cycles of 94 °C for 1 min, 50 °C for 1 min, 72 °C for 1 min (also see Sections 6.7
and 6.8 infra). In preferred embodiments, the primers comprise nucleic acid sequence of
SEQ ID NOS:2475 and 2476, or SEQ ID NOS:2480 and 2481. In preferred embodiments,
the thermal cycles are 94 °C for 10 min followed by 40 cycles of 94 °C for 30 seconds, 56
°C for 30 seconds, 72 °C for 30 seconds, followed by 72 °C for 10 minutes. In preferred
embodiments, the primers comprise nucleic acid sequence of SEQ ID NOS:2477 and 2478.
In a more preferred specific embodiment, the present invention provides a real-time
quantitative PCR assay to detect the presence of hSARS virus in a biological sample by
subjecting the cDNA obtained by reverse transcription of the extracted total RNA from the
sample to PCR reactions using the specific primers, such as those having nucleotide
sequences of SEQ ID NOS:3 and 4, and a fluorescence dye, such as SYBR® Green I, which
fluoresces when bound non-specifically to double-stranded DNA. The fluorescence signals
from these reactions are captured at the end of extension steps as PCR product is generated
over a range of the thermal cycles, thereby allowing the quantitative determination of the
viral load in the sample based on an amplification plot (see Section 6.7, infra).
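
The two thermal cycling programs given in this section can be written down as data, which makes it easy to compare them. The following is a minimal sketch, not part of the disclosed method: the class names `Step` and `Program` are hypothetical, the closing 72 °C for 10 minutes of the second program is assumed to be a single final extension after the 40 cycles, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # hold temperature in degrees Celsius
    seconds: int    # hold time in seconds


@dataclass(frozen=True)
class Program:
    initial: Step          # initial denaturation
    cycle: tuple           # steps repeated n_cycles times
    n_cycles: int
    final: tuple = ()      # optional steps after cycling (assumed final extension)

    def hold_seconds(self) -> int:
        """Total programmed hold time, ignoring ramping between temperatures."""
        per_cycle = sum(step.seconds for step in self.cycle)
        after = sum(step.seconds for step in self.final)
        return self.initial.seconds + self.n_cycles * per_cycle + after


# 94 C for 8 min, then 40 cycles of 94 C / 50 C / 72 C for 1 min each
PROGRAM_A = Program(
    initial=Step(94, 8 * 60),
    cycle=(Step(94, 60), Step(50, 60), Step(72, 60)),
    n_cycles=40,
)

# 94 C for 10 min, then 40 cycles of 94 C / 56 C / 72 C for 30 s each,
# then 72 C for 10 min (read here as a final extension: an assumption)
PROGRAM_B = Program(
    initial=Step(94, 10 * 60),
    cycle=(Step(94, 30), Step(56, 30), Step(72, 30)),
    n_cycles=40,
    final=(Step(72, 10 * 60),),
)

if __name__ == "__main__":
    for name, program in (("A", PROGRAM_A), ("B", PROGRAM_B)):
        print(name, program.hold_seconds() / 60, "min")
```

Under these assumptions the first program holds for 128 minutes and the second for 80 minutes, before any instrument ramp time is added.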

A preferred agent for detecting hSARS is an antibody that specifically binds a
polypeptide of the invention or any hSARS epitope, preferably an antibody with a
detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact
antibody, or a fragment thereof (e.g., Fab or F(ab')2) can be used.

The term "labeled", with regard to the probe or antibody, is intended to encompass
direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable
substance to the probe or antibody, as well as indirect labeling of the probe or antibody by
reactivity with another reagent that is directly labeled. Examples of indirect labeling
include detection of a primary antibody using a fluorescently labeled secondary antibody
and end-labeling of a DNA probe with biotin such that it can be detected with fluorescently
labeled streptavidin. The detection method of the invention can be used to detect mRNA,
protein (or any epitope), or genomic RNA in a sample in vitro as well as in vivo. For
example, in vitro techniques for detection of mRNA include northern hybridizations, in situ
hybridizations, RT-PCR, and RNase protection. In vitro techniques for detection of an
epitope of hSARS include enzyme linked immunosorbent assays (ELISAs), Western blots,
immunoprecipitations and immunofluorescence. In vitro techniques for detection of
genomic RNA include northern hybridizations, RT-PCR, and RNase protection.
Furthermore, in vivo techniques for detection of hSARS include introducing into a subject
organism a labeled antibody directed against the polypeptide. For example, the antibody
can be labeled with a radioactive marker whose presence and location in the subject
organism can be detected by standard imaging techniques, including autoradiography.

In a specific embodiment, the methods further involve obtaining a control sample
from a control subject, contacting the control sample with a compound or agent capable of
detecting hSARS, e.g., a polypeptide of the invention or mRNA or genomic RNA encoding
a polypeptide of the invention, such that the presence of hSARS or the polypeptide or
mRNA or genomic RNA encoding the polypeptide is detected in the sample, and comparing
the presence of hSARS or the polypeptide or mRNA or genomic RNA encoding the
polypeptide in the control sample with the presence of hSARS, or the polypeptide or mRNA
or genomic RNA encoding the polypeptide in the test sample.

The invention also encompasses kits for detecting the presence of hSARS or a
polypeptide or nucleic acid of the invention in a test sample. The kit, for example, can
comprise a labeled compound or agent capable of detecting hSARS or the polypeptide or a
nucleic acid molecule encoding the polypeptide in a test sample and, in certain
embodiments, a means for determining the amount of the polypeptide or mRNA in the
sample (e.g., an antibody which binds the polypeptide or an oligonucleotide probe which
binds to DNA or mRNA encoding the polypeptide). Kits can also include instructions for
use.

For antibody-based kits, the kit can comprise, for example: (1) a first antibody (e.g.,
attached to a solid support) which binds to a polypeptide of the invention or hSARS epitope;
and, optionally, (2) a second, different antibody which binds to either the polypeptide or the
first antibody and is conjugated to a detectable agent.

For oligonucleotide-based kits, the kit can comprise, for example: (1) an
oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic
acid sequence encoding a polypeptide of the invention or to a sequence within the hSARS
genome or (2) a pair of primers useful for amplifying a nucleic acid molecule containing an
hSARS sequence. The kit can also comprise, e.g., a buffering agent, a preservative, or a
protein stabilizing agent. The kit can also comprise components necessary for detecting the
detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample
or a series of control samples which can be assayed and compared to the test sample
contained. Each component of the kit is usually enclosed within an individual container and
all of the various containers are within a single package along with instructions for use.

5.8 Screening Assays to Identify Anti-Viral Agents

The invention provides methods for the identification of a compound that inhibits
the ability of hSARS virus to infect a host or a host cell. In certain embodiments, the
invention provides methods for the identification of a compound that reduces the ability of
hSARS virus to replicate in a host or a host cell. Any technique well-known to the skilled
artisan can be used to screen for a compound that would abolish or reduce the ability of
hSARS virus to infect a host and/or to replicate in a host or a host cell.

In certain embodiments, the invention provides methods for the identification of a
compound that inhibits the ability of hSARS virus to replicate in a mammal or a
mammalian cell. More specifically, the invention provides methods for the identification of
a compound that inhibits the ability of hSARS virus to infect a mammal or a mammalian
cell. In certain embodiments, the invention provides methods for the identification of a
compound that inhibits the ability of hSARS virus to replicate in a mammalian cell. In a
specific embodiment, the mammalian cell is a human cell.

In another embodiment, a cell is contacted with a test compound and infected with
the hSARS virus. In certain embodiments, a control culture is infected with the hSARS
virus in the absence of a test compound. The cell can be contacted with a test compound
before, concurrently with, or subsequent to the infection with the hSARS virus. In a
specific embodiment, the cell is a mammalian cell. In an even more specific embodiment,
the cell is a human cell. In certain embodiments, the cell is incubated with the test
compound for at least 1 minute, at least 5 minutes, at least 15 minutes, at least 30 minutes, at
least 1 hour, at least 2 hours, at least 5 hours, at least 12 hours, or at least 1 day. The titer of
the virus can be measured at any time during the assay. In certain embodiments, a time
course of viral growth in the culture is determined. If the viral growth is inhibited or
reduced in the presence of the test compound, the test compound is identified as being
effective in inhibiting or reducing the growth or infection of the hSARS virus. In a specific
embodiment, the compound that inhibits or reduces the growth of the hSARS virus is tested
for its ability to inhibit or reduce the growth rate of other viruses to test its specificity for the
hSARS virus.

In one embodiment, a test compound is administered to a model animal and the
model animal is infected with the hSARS virus. In certain embodiments, a control model
animal is infected with the hSARS virus without the administration of a test compound.
The test compound can be administered before, concurrently with, or subsequent to the
infection with the hSARS virus. In a specific embodiment, the model animal is a mammal.
In an even more specific embodiment, the model animal can be, but is not limited to, a
cotton rat, a mouse, or a monkey. The titer of the virus in the model animal can be
measured at any time during the assay. In certain embodiments, a time course of viral
growth in the culture is determined. If the viral growth is inhibited or reduced in the
presence of the test compound, the test compound is identified as being effective in
inhibiting or reducing the growth or infection of the hSARS virus. In a specific
embodiment, the compound that inhibits or reduces the growth of the hSARS in the model
animal is tested for its ability to inhibit or reduce the growth rate of other viruses to test its
specificity for the hSARS virus.
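
The comparison between a treated culture (or animal) and its untreated control can be expressed numerically. The following is a minimal sketch under stated assumptions: the specification does not prescribe a particular metric, so percent inhibition and log10 reduction are used here as common ways to summarize a titer comparison, and the function names and example titers are hypothetical.

```python
import math


def percent_inhibition(control_titer: float, treated_titer: float) -> float:
    """Percent reduction of viral titer relative to the untreated control."""
    if control_titer <= 0:
        raise ValueError("control titer must be positive")
    return 100.0 * (1.0 - treated_titer / control_titer)


def log10_reduction(control_titer: float, treated_titer: float) -> float:
    """Reduction of viral titer in log10 units relative to the control."""
    if control_titer <= 0 or treated_titer <= 0:
        raise ValueError("titers must be positive")
    return math.log10(control_titer / treated_titer)


if __name__ == "__main__":
    # Hypothetical titers (infectious units per ml), for illustration only
    control, treated = 1e6, 1e4
    print(round(percent_inhibition(control, treated), 3))  # percent
    print(round(log10_reduction(control, treated), 3))     # log10 units
```

Applying the same two functions to titers of an unrelated virus grown with and without the compound gives a simple way to express the specificity test described above.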

6. EXAMPLES

The following examples illustrate the isolation and identification of the novel
hSARS virus. These examples should not be construed as limiting.

METHODS AND RESULTS

As a general reference, Wiedbrauk DL & Johnston SLG. (Manual of Clinical
Virology, Raven Press, New York, 1993) was used.

6.1 Clinical Subjects 

The study included all 50 patients who fitted a modified World Health Organization
(WHO) definition of SARS and were admitted to 2 acute regional hospitals in Hong Kong
Special Administrative Region (HKSAR) between February 26 and March 26, 2003 (WHO.
Severe acute respiratory syndrome (SARS). Weekly Epidemiol Rec. 2003; 78: 81-83). A
lung biopsy from an additional patient, who had typical SARS and was admitted to a third
hospital, was also included in the study. Briefly, the case definition for SARS was: (i) fever
of 38°C or more; (ii) cough or shortness of breath; (iii) new pulmonary infiltrates on chest
radiograph; and (iv) either a history of exposure to a patient with SARS or absence of
response to empirical antimicrobial coverage for typical and atypical pneumonia
(beta-lactams and macrolides, fluoroquinolones or tetracyclines).

Nasopharyngeal aspirates and serum samples were collected from all patients.
Paired acute and convalescent sera and feces were available from some patients. Lung
biopsy tissue from one patient was processed for a viral culture, RT-PCR, routine
histopathological examination, and electron microscopy. Nasopharyngeal aspirates, feces
and sera submitted for microbiological investigation of other diseases were included in the
study under blinding and served as controls.

The medical records were reviewed retrospectively by the attending physicians and
clinical microbiologists. Routine hematological, biochemical and microbiological
examinations, including bacterial culture of blood and sputum, serological study and
collection of nasopharyngeal aspirates for virological tests, were carried out.






6.2 Cell Line 

FRhK-4 (fetal rhesus monkey kidney) cells were maintained in minimal essential
medium (MEM) with 1% fetal calf serum, 1% streptomycin and penicillin, 0.2% nystatin
and 0.05% garamycin.



6.3 Viral Infection 

Two hundred µl of clinical (nasopharyngeal aspirate) samples, from two patients
(see the Result section, infra), in virus transport medium were used to infect FRhK-4 cells.
The inoculated cells were incubated at 37°C for 1 hour. One ml of MEM containing 1 µg
trypsin was then added to the culture and the infected cells were incubated in a 37°C
incubator supplied with 5% carbon dioxide. Cytopathic effects were observed in the
infected cells after 2 to 4 days of incubation. The infected cells were passaged into new
FRhK-4 cells and cytopathic effects were observed within 1 day after the inoculation. The
infected cells were tested by an immunofluorescent assay for influenza A, influenza B,
respiratory syncytial virus, parainfluenza types 1, 2 and 3, adenovirus and human
metapneumovirus (hMPV) and negative results were obtained for all cases. The infected
cells were also tested by RT-PCR for influenza A and human metapneumovirus with
negative results.


6.4 Virus Morphology 

The infected cells prepared as described above were harvested, pelleted by
centrifugation and the cell pellets were processed for thin-section transmission electron
microscopic visualization. Viral particles were identified in the cells infected with both
clinical specimens, but not in control cells which were not infected with the virus. Virions
isolated from the infected cells were about 70-100 nanometers (Figure 2). Viral capsids
were found predominantly within the vesicles of the Golgi and endoplasmic reticulum and
were not free in the cytoplasm. Virus particles were also found at the cell membrane.
One virus isolate was ultracentrifuged and the cell pellet was negatively stained
using phosphotungstic acid. Virus particles characteristic of Coronaviridae were thus
visualized. Since the human Coronaviruses hitherto recognized are not known to cause a
similar disease, the present inventors postulated that the virus isolates represent a novel
virus that infects humans.

6.5 Antibody Response to the Isolated Virus 

To further confirm that this novel virus is responsible for causing SARS in the
infected patients, blood serum samples from the patients who were suffering from SARS
were obtained and a neutralization test was performed. Typically diluted serum (x50, x200,
x800 and x1600) was incubated with acetone-fixed FRhK-4 cells infected with hSARS at
37°C for 45 minutes. The incubated cells were then washed with phosphate-buffered saline
and stained with anti-human IgG-FITC conjugated antibody. The cells were then washed
and examined under a fluorescent microscope. In these experiments, positive signals were
found in 8 patients who had SARS (Figure 3), indicating that these patients had an IgG
antibody response to this novel human respiratory virus of Coronaviridae. By contrast, no
signal was detected in 4 negative-control paired sera. The serum titers of anti-hSARS
antibodies of the tested patients are shown in Table 1.

Table 1

Name          Date           Lab No.       Anti-SARS
Patient A     25-Feb-03      S2728         <50
              6-Mar-03       S2728         1600
Patient B     26-Feb-03      S2441         50
              3-Mar-03       S2441         200
Patient C     4-Mar-03       S3279         200
              14-Mar-03      S3279         1600
Patient D     6-Mar-03       M41045        <50
              11-Mar-03      MB943703      800
Patient E     4-Mar-03       M38953        <50
              18-Mar-03      KWH03/3601    800
Control F     13-Feb-03      M27124        <50
              1-Mar-03       MB942968      <50
Patient G     3-Mar-03       M38685        <50
              7-Mar-03       KWH03/2900    Equivocal

Blinded samples:
1a*           Acute                        <50
1b            Convalescent                 1600
2a*           Acute                        50
2b            Convalescent                 >1600
3a*           Acute                        50
3b            Convalescent                 >1600
4a*           Acute                        <50
4b            Convalescent                 <50
5a*           Acute                        <50
5b            Convalescent                 <50
6a*           Acute                        <50
6b            Convalescent                 <50

NB: * patients with SARS

These results indicated that this novel member of Coronaviridae is a key pathogen 
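
The paired titers in Table 1 can be compared by computing the fold rise between the earlier and later serum of each patient. The following is a minimal sketch under stated assumptions: the four-fold threshold is a common serological convention and is not stated in this specification, the function names are hypothetical, and a censored value such as "<50" is replaced by its reported bound, so the result for such pairs is only approximate.

```python
def parse_titer(text: str):
    """Return (value, flag) where flag is '<', '>' or '' for an exact titer."""
    text = text.strip()
    if text[0] in "<>":
        return int(text[1:]), text[0]
    return int(text), ""


def fold_rise(acute: str, convalescent: str) -> float:
    """Ratio of later to earlier titer, using the reported bound for censored values."""
    acute_value, _ = parse_titer(acute)
    later_value, _ = parse_titer(convalescent)
    return later_value / acute_value


# Assumed convention, not taken from the specification
ASSUMED_FOLD_THRESHOLD = 4


def is_rising(acute: str, convalescent: str) -> bool:
    return fold_rise(acute, convalescent) >= ASSUMED_FOLD_THRESHOLD


# First and second titers copied from Table 1 for four of the paired sera
PAIRED_SERA = {
    "Patient A": ("<50", "1600"),
    "Patient B": ("50", "200"),
    "Patient D": ("<50", "800"),
    "Patient E": ("<50", "800"),
    "Control F": ("<50", "<50"),
}

if __name__ == "__main__":
    for name, (first, second) in PAIRED_SERA.items():
        print(name, fold_rise(first, second), is_rising(first, second))
```

With these inputs the four patient pairs meet the assumed threshold and the control pair does not.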



6.6 Sequences of the hSARS Virus 

Total RNA from infected or uninfected FRhK-4 cells was harvested two days
post-infection. One hundred ng of purified RNA was reverse transcribed using Superscript II
reverse transcriptase (Invitrogen) in a 20 µl reaction mixture containing 10 pg of a
degenerate primer (5'-GCCGGAGCTCTGCAGAATTCNNNTSNNN-3', N=A, T, G or C:
SEQ ID NO:5) as recommended by the manufacturer. Reverse transcribed products were
then purified by a QIAquick PCR purification kit as instructed by the manufacturer and
eluted in 30 µl of 10 mM Tris-HCl, pH 8.0. Three µl of purified cDNA products were added
to a 25 µl reaction mixture containing 2.5 µl of 10x PCR buffer, 4 µl of 25 mM MgCl2, 0.5
µl of 10 mM dNTP, 0.25 µl of AmpliTaq Gold® DNA polymerase (Applied Biosystems),
2.5 µCi of [α-32P]CTP (Amersham), and 2 µl of 10 nM primer
(5'-GCCGGAGCTCTGCAGAATTC-3': SEQ ID NO:6). Reactions were thermal cycled
through the following profile: 94°C for 8 min followed by 2 cycles of 94°C for 1 min,
40°C for 1 min, 72°C for 2 min. This temperature profile was followed by 35 cycles of
94°C for 1 min, 60°C for 1 min, 72°C for 1 min. Six µl of the PCR products were analyzed
by 5% denaturing polyacrylamide gel electrophoresis. The gel was exposed to X-ray film and
the film was developed after an overnight exposure. Unique PCR products which were only
identified in infected cell samples were isolated from the gel and eluted in 50 µl of 1x TE
buffer. Eluted PCR products were then re-amplified in 25 µl of reaction mixture containing
2.5 µl of 10x PCR buffer, 4 µl of 25 mM MgCl2, 0.5 µl of 10 mM dNTP, 0.25 µl of
AmpliTaq Gold® DNA polymerase (Applied Biosystems), and 1 µl of 10 µM primer
(5'-GCCGGAGCTCTGCAGAATTC-3': SEQ ID NO:6). Reaction mixtures were thermal
cycled through the following profile: 94°C for 8 min followed by 35 cycles of 94°C for 1
min, 60°C for 1 min, 72°C for 1 min. PCR products were cloned using a TOPO TA cloning
kit (Invitrogen) and ligated plasmids were transformed into TOP10 E. coli competent cells
(Invitrogen). PCR inserts were sequenced by a BigDye cycle sequencing kit as
recommended by the manufacturer (Applied Biosystems) and sequencing products were
analyzed by an automatic sequencer (Applied Biosystems, model number 3770). The
obtained sequence (SEQ ID NO:1) is shown in Figure 1. The deduced amino acid
sequence (SEQ ID NO:2) from the obtained DNA sequence showed 57% homology to the
polymerase protein of identified coronaviruses.

Similarly, two other partial sequences (SEQ ID NOS:11 and 13) and deduced amino
acid sequences (SEQ ID NOS:12 and 14, respectively) were obtained from the hSARS virus
and are shown in Figures 8 (SEQ ID NOS:11 and 12) and 9 (SEQ ID NOS:13 and 14).

The entire genomic sequence of hSARS virus is shown in Figure 10 (SEQ ID
NO:15). The deduced amino acid sequences of SEQ ID NO:15 in all three frames (SEQ ID
NOS:16, 240 and 737) are shown in Figure 11 (SEQ ID NOS:17-239, 241-736 and
738-1107). The deduced amino acid sequences of the complement of SEQ ID NO:15 in all three
frames (SEQ ID NOS:1108, 1590 and 1965) are shown in Figure 12 (SEQ ID NOS:1109-1589,
1591-1964 and 1966-2470).
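
Deducing amino acid sequences in all three frames of a nucleotide sequence and of its complement amounts to a six-frame translation. The following is a minimal sketch, not the procedure used to prepare Figures 11 and 12: it assumes the standard genetic code, reads the complementary strand as the reverse complement (5' to 3'), and uses a short made-up sequence rather than SEQ ID NO:15. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(seq: str) -> str:
    """Upper-case the sequence and write RNA (U) as DNA (T)."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    return normalize(seq).translate(COMPLEMENT)[::-1]


def translate(seq: str, frame: int = 0) -> str:
    """Translate from offset `frame` (0, 1 or 2); '*' marks a stop codon."""
    seq = normalize(seq)
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    # Codons containing ambiguous bases are reported as 'X'
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(seq: str) -> dict:
    """Three forward frames (+1..+3) and three reverse-complement frames (-1..-3)."""
    rc = reverse_complement(seq)
    frames = {f"+{i + 1}": translate(seq, i) for i in range(3)}
    frames.update({f"-{i + 1}": translate(rc, i) for i in range(3)})
    return frames


if __name__ == "__main__":
    # Made-up 9-nucleotide example, not a sequence from this specification
    for name, protein in six_frames("ATGGCCTAA").items():
        print(name, protein)
```

Splitting each translated frame at the `*` characters would give the stop-delimited segments of that frame.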

6.7 Detection of hSARS Virus in Nasopharyngeal Aspirates

First, the nasopharyngeal aspirates (NPA) were examined by rapid
immunofluorescent antigen detection for influenza A and B, parainfluenza types 1, 2 and 3,
respiratory syncytial virus and adenovirus (Chan KH, Maldeis N, Pope W, Yup A, Ozinskas
A, Gill J, Seto WH, Shortridge KF, Peiris JSM. Evaluation of Directigen Flu A+B test for
rapid diagnosis of influenza A and B virus infections. J Clin Microbiol. 2002; 40: 1675-1680)
and were cultured for conventional respiratory pathogens on Madin Darby Canine
Kidney, LLC-Mk2, RDE, Hep-2 and MRC-5 cells (Wiedbrauk DL, Johnston SLG. Manual
of clinical virology. Raven Press, New York. 1993). Subsequently, fetal rhesus kidney
(FRhK-4) and A-549 cells were added to the panel of cell lines used. Reverse transcription
polymerase chain reaction (RT-PCR) was performed directly on the clinical specimen for
influenza A (Fouchier RA, Bestebroer TM, Herfst S, Van Der Kemp L, Rimmelzwaan GF,
Osterhaus AD. Detection of influenza A virus from different species by PCR amplification
of conserved sequences in the matrix gene. J Clin Microbiol. 2000; 38: 4096-101) and
human metapneumovirus (HMPV). The primers used for HMPV were: for first round,
5'-AARGTSAATGCATCAGC-3' (SEQ ID NO:7) and 5'-CAKATTYTGCTTATGCTTTC-3'
(SEQ ID NO:8); and nested primers: 5'-ACACCTGTTACAATACCAGC-3' (SEQ ID
NO:9) and 5'-GACTTGAGTCCCAGCTCCA-3' (SEQ ID NO:10). The size of the nested
PCR product was 201 bp. An ELISA for mycoplasma was used to screen cell cultures
(Roche Diagnostics GmbH, Roche, Indianapolis, USA).

RT-PCR Assay

Subsequent to culturing and genetic sequencing of the hSARS virus from two
patients (see Section 6.6, supra), an RT-PCR was developed to detect the hSARS virus
sequence from NPA samples. Total RNA from clinical samples was reverse transcribed
using random hexamers and cDNA was amplified using primers 5'-TACACACCTCAGCGTTG-3'
(SEQ ID NO:3) and 5'-CACGAACGTGACGAAT-3' (SEQ ID NO:4), which are
constructed based on the RNA-dependent RNA polymerase encoding sequence (SEQ ID
NO:1) of the hSARS virus, in the presence of 2.5 mM MgCl2 (94 °C for 8 min followed by 40
cycles of 94 °C for 1 min, 50 °C for 1 min, 72 °C for 1 min).

The summary of a typical RT-PCR protocol is as follows:

1. RNA extraction 

RNA from 140 µl of NPA samples is extracted by QIAquick viral RNA extraction
kit and is eluted in 50 µl of elution buffer.

2. Reverse transcription

RNA                                      11.5 µl
0.1 M DTT                                2 µl
5x buffer                                4 µl
10 mM dNTP                               1 µl
Superscript II, 200 U/µl (Invitrogen)    1 µl
Random hexamers, 0.3 µg/µl               0.5 µl

Reaction condition    42 °C, 50 min
                      94 °C, 3 min
                      4 °C



3. PCR

cDNA generated by random primers is amplified in a 50 µl reaction as follows:

cDNA                                                     2 µl
10 mM dNTP                                               0.5 µl
10x buffer                                               5 µl
25 µM Forward primer                                     0.5 µl
25 µM Reverse primer                                     0.5 µl
AmpliTaq Gold polymerase, 5 U/µl (Applied Biosystems)    0.25 µl
Water                                                    36.25 µl

Thermal-cycle condition: 95 °C, 10 min, followed by 40 cycles of 95 °C, 1 min;
50 °C, 1 min; 72 °C, 1 min.

4. Primer sequences 

Primers were designed based on the RNA-dependent RNA polymerase encoding
sequence (SEQ ID NO:1) of the hSARS virus.

Forward primer: 5' TACACACCTCAGCGTTG 3' (SEQ ID NO:3)
Reverse primer: 5' CACGAACGTGACGAAT 3' (SEQ ID NO:4)

Product size: 182 bp
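
As a rough sanity check on the primer pair listed above, basic primer properties can be computed as in the sketch below. The Wallace rule (2 °C per A/T, 4 °C per G/C) is a generic rule of thumb for short oligonucleotides and is an assumption here, not the design method stated in this specification; the function name is hypothetical.

```python
# Hypothetical helper: length, GC content and a Wallace-rule Tm estimate.
# The Wallace rule is only an approximation for short primers.
def primer_stats(seq):
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return {
        "length": len(seq),
        "gc_percent": round(100 * gc / len(seq), 1),
        "wallace_tm_c": 2 * at + 4 * gc,
    }


FORWARD = "TACACACCTCAGCGTTG"  # SEQ ID NO:3
REVERSE = "CACGAACGTGACGAAT"   # SEQ ID NO:4

forward_stats = primer_stats(FORWARD)
reverse_stats = primer_stats(REVERSE)
```

Under this approximation the two primers come out at about 52 °C and 48 °C, which is in the neighbourhood of the 50 °C annealing step given in the thermal-cycle condition.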

Real-Time Quantitative PCR Assay

Total RNA from 140 µl of nasopharyngeal aspirate (NPA) was extracted by
QIAamp® virus RNA mini kit (Qiagen) as instructed by the manufacturer. Ten µl of eluted
RNA samples were reverse transcribed by 200 U of Superscript II® reverse transcriptase
(Invitrogen) in a 20 µl reaction mixture containing 0.15 µg of random hexamers, 10 mmol/l
DTT, and 0.5 mmol/l dNTP, as instructed. Complementary DNA was then amplified in
SYBR® Green I fluorescence reaction mixtures (Roche). Briefly, 20 µl reaction mixtures
containing 2 µl of cDNA, 3.5 mmol/l MgCl2, 0.25 µmol/l of forward primer
(5'-TACACACCTCAGCGTTG-3'; SEQ ID NO:3) and 0.25 µmol/l reverse primer
(5'-CACGAACGTGACGAAT-3'; SEQ ID NO:4) were thermal-cycled by a Light-Cycler
(Roche) with the PCR program [95°C, 10 min followed by 50 cycles of 95°C, 10 min;
57°C, 5 sec; 72°C, 9 sec]. Plasmids containing the target sequence were used as positive
controls. Fluorescence signals from these reactions were captured at the end of the extension
step in each cycle. To determine the specificity of the assay, PCR products (184 base pairs)
were subjected to a melting curve analysis at the end of the assay (65°C to 95°C, 0.1 °C per
second).
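
The melting curve analysis described above is commonly read by locating the temperature at which fluorescence falls fastest, i.e. the peak of -dF/dT. The sketch below illustrates that reading on a synthetic curve; the function name, the sigmoid model and the transition placed at 80.5 °C are all assumptions made for the example, and the instrument software may use a different algorithm.

```python
import math


# Hypothetical helper: melting temperature as the peak of -dF/dT,
# estimated with central finite differences.
def melting_peak(temps, fluorescence):
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = -(fluorescence[i + 1] - fluorescence[i - 1]) / (
            temps[i + 1] - temps[i - 1]
        )
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


# Synthetic melt from 65 °C to 95 °C in 0.1 °C steps (assumed values only).
temps = [65.0 + 0.1 * i for i in range(301)]
signal = [1.0 / (1.0 + math.exp((t - 80.5) / 0.8)) for t in temps]
peak = melting_peak(temps, signal)
```

A single sharp peak is consistent with one specific product; additional peaks at lower temperature would usually suggest primer-dimers or non-specific amplification.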

6.8 Detection of N-gene of hSARS Virus in Patients 

6.8.1 RT-PCR diagnosis protocol for coronavirus in SARS patients

Equipment required (for 96 samples):
1 x SV Total RNA Isolation system
2 x Mega titer plate
3 x 96-well PCR plate
1 x 0.5-10 µl multi-channel pipette
1 x 10-100 µl multi-channel pipette
1 x 20-200 µl multi-channel pipette
1 x vacuum pump
1 x swing-bucket rotor with microtest plate buckets
2 x PCR machine (96-well plate compatible)
1 x Gel electrophoresis apparatus

Station 1* - clinical samples handling (1 medical officer/ clinical technician) 

• Aliquot 500 µl sample in viral transport medium (containing, per liter, 2 g of sodium
bicarbonate, 5 g of bovine serum albumin, 200 µg of vancomycin, 18 µg of amikacin,
and 160 U of nystatin in Earle's balanced salt solution) from each individual vial
into a well of 96-well mega titer plate containing 500 µl lysis buffer (1x) containing
100 µl PK-15 cell (ATCC CCL-33; 5.0 x 10^5 cells/ml) in complete minimum essential
medium with Earle's salt (EMEM, Invitrogen) as internal control.**

• Mix the lysate by pipetting up-and-down 3 times.
• Proceed to Station 2.

* Station 1 should be carried out inside Class III biological safety cabinet.

** At least two negative samples should be included in a 96-well platform as a negative
control.

Station 2 - Total RNA extraction (1 laboratory technician) 

• Set up the Vacuum Manifold unit. Place the binding plate onto the Manifold Base.

• Transfer the lysate from mega titer plate to each well of the SV 96 Binding Plate
(binding plate).

• Apply vacuum until the lysate passes through the binding plate. Release vacuum.

• Add 500 µl of SV RNA Wash Solution (wash solution) to each well of binding plate.

• Apply vacuum until the wash solution passes through binding plate. Release vacuum.

• Prepare DNase incubation mix for an entire 96-well plate as below:

Yellow Core Buffer    2 ml
0.09 M MnCl2          250 µl
DNase I               250 µl

• Apply 25 µl freshly prepared DNase incubation mix directly to the membrane of the
binding plate.

• Incubate at 20-25°C for 10 minutes.

• Add 200 µl of SV DNase Stop Solution to each well of the binding plate.

• Apply vacuum until the SV DNase Stop Solution passes through the binding plate.
Release vacuum.

• Add 500 µl wash solution to each well of the binding plate.

• Apply vacuum until wash solution passes through the binding plate. Turn off
vacuum.

• Spin the binding plate at 3000 xg for 30 seconds to remove residual wash solution.

• Transfer the binding plate on top of a 96-well RT plate.

• Add 50 µl nuclease-free water into each well of the binding plate to elute RNA.

• Incubate at room temperature for 1 minute.

• Spin the binding plate at 3000 xg, 4°C for 1 minute.

• Collect eluted RNA in the 96-well RT plate.

• Add 5 µl of 3 M sodium acetate and 200 µl of 95% ethanol into each well of the
plate.

• Place the RT plate on ice and incubate for 15 minutes.

• Spin the plate at 3000 xg, 4°C, 15 minutes.

• Discard supernatant by inverting the plate and blotting on a clean paper towel.

• Wash the pellet with 200 µl of 70% ethanol.

• Spin the plate at 3000 xg, 4°C, 10 minutes.

• Discard supernatant by inverting the plate and blotting on a clean paper towel.

• Air-dry the pellet for 5 minutes.

• Add 12 µl of nuclease-free water into each well.

• Vortex the plate briefly to dissolve the pellet (for an example result, see Fig. 18).

• Proceed to Station 3.

Station 3 - Reverse transcription (1 laboratory technician)

• Prepare RT master mix for an entire 96-well plate in a 1.5-ml tube as below (100
reactions):

                              Per reaction    x100
Random hexamers, 3 µg/µl      0.05 µl         5 µl
dNTPs, 10 mM                  1 µl            100 µl
First-strand buffer, 5x       4 µl            400 µl
DTT, 0.1 M                    2 µl            200 µl
Superscript II, 200 U/µl      1 µl            100 µl
Total                         8.05 µl         805 µl

• Aliquot 100 µl RT mix into 8 wells of a clean 96-well master mix plate.

• From this plate, transfer 8.05 µl RT mix to each well of RT plate containing 12 µl
RNA, mix by pipetting up-and-down 3 times with a multi-channel pipette.
REPLACE TIP AFTER EACH TRANSFER.

• Incubate the samples at 42°C for 50 minutes followed by 70°C for 15 minutes.

• Proceed to Station 4.
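
The scaling of the per-reaction RT recipe to a 100-reaction master mix is simple arithmetic and can be checked with a short script. The sketch below is illustrative only; the dictionary keys and function names are hypothetical, and the volumes are those listed for Station 3.

```python
# Per-reaction RT master mix volumes in microlitres, as listed for Station 3.
RT_MIX_UL = {
    "Random hexamers, 3 ug/ul": 0.05,
    "dNTPs, 10 mM": 1.0,
    "First-strand buffer, 5x": 4.0,
    "DTT, 0.1 M": 2.0,
    "Superscript II, 200 U/ul": 1.0,
}


def scale_master_mix(per_reaction_ul, n_reactions):
    """Multiply every component volume by the number of reactions."""
    return {
        name: round(volume * n_reactions, 2)
        for name, volume in per_reaction_ul.items()
    }


def total_volume(mix_ul):
    """Sum the component volumes, rounded to 0.01 microlitre."""
    return round(sum(mix_ul.values()), 2)


plate_mix = scale_master_mix(RT_MIX_UL, 100)
```

Preparing 100 reactions for a 96-well plate leaves a small excess to cover pipetting losses.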


Station 4 - N-gene specific PCR (1 laboratory technician) 

• Prepare PCR master mix for an entire 96-well plate in two 2059 culture tubes as
below (100 reactions):

                            N-specific PCR                   Control PCR
                            Per 25 µl reaction    x100       Per 25 µl reaction    x100
mQH2O                       18.65 µl              1865 µl    17.65 µl              1765 µl
10x PCR buffer              2.5 µl                250 µl     2.5 µl                250 µl
25 mM MgCl2                 1.5 µl                150 µl     2.5 µl                250 µl
10 mM dNTPs                 0.25 µl               25 µl      0.25 µl               25 µl
Forward primer, 10 µM       0.5 µl                50 µl      0.5 µl                50 µl
Reverse primer, 10 µM       0.5 µl                50 µl      0.5 µl                50 µl
AmpliTaq Gold®, 500 U       0.1 µl                10 µl      0.1 µl                10 µl
Template DNA                1 µl                  -          1 µl                  -
Total                       25 µl                 2400 µl    25 µl                 2400 µl

• N-gene specific PCR and control PCR are performed in two individual PCR plates.

• Aliquot 290 µl PCR master mix into the first column of a 96-well PCR plate.

• From the first column, aliquot 24 µl of master mix into each well of PCR plate.

• Transfer 1 µl of cDNA template (from Station 3) into each well of PCR plate.

• Mix by pipetting up-and-down 3 times with a multi-channel pipette. REPLACE
TIP AFTER EACH TRANSFER.

• Seal the plate with sealing tape.

• Perform the following reaction in two 96-well PCR machines:

N-gene specific PCR                  Control PCR
94°C  10 minutes                     94°C  10 minutes
94°C  30 seconds  \                  94°C  30 seconds  \
56°C  30 seconds   } 40 cycles       55°C  30 seconds   } 35 cycles
72°C  30 seconds  /                  72°C  45 seconds  /
72°C  10 minutes                     72°C  10 minutes
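
The programmed hold time of the two thermal profiles above can be estimated with the sketch below. It sums only the programmed steps and ignores temperature ramping, which depends on the instrument; the function name is hypothetical.

```python
# Hypothetical helper: total programmed hold time of a PCR profile in minutes.
# Ramp times between temperatures are instrument-dependent and not included.
def program_minutes(initial_s, cycle_steps_s, cycles, final_s):
    return (initial_s + cycles * sum(cycle_steps_s) + final_s) / 60


# N-gene specific PCR: 94°C 10 min; 40 x (30 s, 30 s, 30 s); 72°C 10 min
n_gene_minutes = program_minutes(600, (30, 30, 30), 40, 600)

# Control PCR: 94°C 10 min; 35 x (30 s, 30 s, 45 s); 72°C 10 min
control_minutes = program_minutes(600, (30, 30, 45), 35, 600)
```

Both programs hold for roughly 80 minutes, so the two plates finish at about the same time when run in parallel on two machines.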

Station 5 - Gel electrophoresis (1 laboratory technician)

• Mix 5 µl of N-gene specific PCR product and 5 µl control PCR product with 1 µl
bromophenol blue loading dye.

• Load the samples into the wells of a 2% agarose gel.

• Electrophorese the PCR products at 140 V, 250 mA for 30 minutes.

• Stain the gel with ethidium bromide.

• Visualize the products with UV and record the result.

6.8.2 Using primers of SEQ ID NOS:2480 and 2481 

RT-PCR diagnostic protocol was performed as described in Section 6.8.1 with some
modifications.

RNA isolation from clinical samples

Clinical samples including nasopharyngeal aspirates (NPA) and stool specimens
were provided by the Department of Microbiology, The University of Hong Kong. In
addition, tracheal dispersion and lung biopsy from an index patient A described in New
Engl. J. Med. 348:1967-76 (by Drosten C.S., et al., 2003) at three time points were also
collected. Sample collection was conducted from April 1 to April 28, 2003 in local
hospitals. Method of sample collection was described in the previous section (also see
Poon et al., 2003, Clinical Chemistry 49:953-955). Total RNA extraction from clinical
samples was carried out with SV96 Total RNA Isolation System (Promega, WI, USA), with
the following modifications from the manufacturer's protocol. Five-hundred (500) µl of
NPA/stool sample in viral transport medium (containing, per liter, 2 g of sodium
bicarbonate, 5 g of bovine serum albumin, 200 µg of vancomycin, 18 µg of amikacin, and
160 U of nystatin in Earle's balanced salt solution) was mixed with equal volume of SV
RNA Lysis Buffer containing 100 µl of pig kidney epithelial (PK-15) cell (ATCC CCL-33;
5.0 x 10^5 cells/ml) in complete minimum essential medium with Earle's salt (EMEM,
Invitrogen) as internal control. The mixture was transferred to the wells of the SV 96
Binding Plate. After washing with 500 µl of SV RNA Wash Solution prior to elution step,
the plate was spun at 3000 xg for 30 seconds to remove residual wash solution. RNA was
then eluted with 50 µl of nuclease-free water, and was collected in a clean 96-well PCR
plate by spinning the plate at 3000 xg for 1 minute. Eluted RNA was then concentrated by
incubating on ice for 15 minutes, in the presence of 5 µl of 3 M sodium acetate and 200 µl
of 95% ethanol. After centrifugation at 3000 xg, 4°C for 15 minutes, RNA pellet was
washed with 200 µl of 75% ethanol and dissolved with 12 µl of nuclease-free water.
Extracted RNA was immediately reverse-transcribed to first-strand cDNA.


First-Strand cDNA Synthesis 

Reverse-transcription was performed with 200 U of Superscript® II reverse
transcriptase (Invitrogen, USA) in a 20 µl reaction containing 0.15 µg of random hexamers,
RT buffer (1x), 10 mM dithiothreitol (DTT) and 0.5 mM deoxynucleotide triphosphates
(dNTPs). Reaction was carried out in Peltier Thermal Cycler (MJ Research) with the
following conditions: 50 minutes at 42°C followed by 15 minutes at 70°C.

Polymerase Chain Reaction (PCR)

Primers were designed according to the complete SARS-CoV genomic sequence of a
local specimen HK-39 announced previously (accession no. AY278491). Forward primer
(SRS251: 5'-GCAGTCAAGCCTCTTCTCG-3'; SEQ ID NO:2480, corresponding to nt
28658-28676 of HK-39 SARS genome, i.e., CCTCC200303) and reverse primer (SRS252:
5'-GCCTCAGCAGCAGATTTC-3'; SEQ ID NO:2481, corresponding to nt 28866-28883
of HK-39 SARS genome) amplified a 225 bp fragment from the region of N-gene that
showed no homology to other coronaviruses. Primers amplifying RNA-dependent RNA
polymerase (1b gene) were used as parallel control (coro3: 5'-TACACACCTCAGCGTTG-3'
(SEQ ID NO:3), corresponding to nt 18041-18057; and coro4:
5'-CACGAACGTGACGAAT-3' (SEQ ID NO:4), corresponding to nt 18207-18222,
Department of Microbiology, the University of Hong Kong). Both amplicons were cloned
into the same pCR2.1 cloning vector (Fig. 17). Serially diluted plasmid was then used to
determine the dynamic range and optimal condition of the PCRs (Fig. 21A and 21B).
Another set of primers, amplifying a 745 bp fragment from the pig β-actin gene, was
employed as an internal control for the diagnostic PCR assay (actin-F:
5'-TGAGACCTTCAACACGCC-3' (SEQ ID NO:2482); and actin-R:
5'-ATCTGCTGGAAGGTGGAC-3' (SEQ ID NO:2483)).
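
The relationship between primer coordinates and expected product size can be illustrated with the arithmetic below. It is a sketch under the assumption that the quoted nucleotide positions are 1-based and inclusive on the HK-39 genome; the names are hypothetical. Under that assumption the 1b pair spans 182 nt, while the N-gene pair spans 226 nt, one more than the 225 bp stated above, so the numbering convention behind the quoted fragment size may differ from the one assumed here.

```python
# Quoted primer coordinates: 5' end of forward primer, 3' end of reverse primer.
# Positions are assumed to be 1-based and inclusive.
PRIMER_PAIRS = {
    "1b (coro3/coro4)": (18041, 18222),
    "N (SRS251/SRS252)": (28658, 28883),
}


def amplicon_length(start, end):
    """Length of the region spanned by both primers, counting both ends."""
    return end - start + 1


amplicon_sizes = {
    name: amplicon_length(start, end)
    for name, (start, end) in PRIMER_PAIRS.items()
}
```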

Conventional PCR and gel electrophoresis were carried out as a preliminary experiment.
Briefly, 1 µl of cDNA from clinical samples was amplified with 0.5 U Taq DNA
polymerase recombinant (Invitrogen Life Technologies) in a 25-µl reaction containing PCR
buffer (1x), 1.5 mM MgCl2, 0.1 mM dNTPs and 0.5 pmol of each forward and reverse
primer. Reaction was performed in Peltier Thermal Cycler (MJ Research) with the
following conditions: 3 minutes at 94 °C, followed by 50 cycles of 94 °C for 10 seconds,
56 °C for 10 seconds, 72 °C for 10 seconds, and a 10-minute final extension step at 72 °C.
Amplicons were analyzed with 2% agarose gel electrophoresis (Fig. 23). Quantitative
real-time PCR using SYBR® Green fluorophore was performed in diagnosis of clinical
samples. In a 25 µl reaction, 1 µl cDNA template was mixed with 12.5 µl of SYBR® Green
PCR Master Mix (2x) (Applied Biosystems) and 0.5 pmol of each forward and reverse primer.
Volume of the reaction was adjusted to 25 µl with distilled water. Reactions were
performed in the iCycler iQ Real-Time PCR Detection System (Bio-Rad) under the same
condition as the conventional PCR. Fluorescence signals (FAM, excitation = 490 nm,
emission = 530 nm) were collected at the end of each extension step during the PCR cycles
(Fig. 22A). Threshold cycle (Ct) of each sample was determined using the maximum curvature
approach. Melting curve analysis was performed after the 10-minute final extension (Fig.
22B). cDNA from non-SARS patients, including patients suffering from adenovirus (n = 5),
respiratory syncytial virus (n = 5), human metapneumovirus (n = 5), influenza A virus (n =
5), or influenza B virus (n = 5) infection, were used as negative controls for the assay.
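
One common reading of a "maximum curvature" threshold is the cycle at which the second derivative of the amplification curve is largest. The sketch below illustrates that reading on a synthetic curve; it is an assumption about the approach, not a description of the detection system's actual algorithm, and the function name and curve parameters are hypothetical.

```python
import math


# Hypothetical helper: Ct taken as the cycle with the largest discrete
# second difference of fluorescence (one reading of "maximum curvature").
def ct_max_second_difference(fluorescence):
    best_cycle, best_value = None, float("-inf")
    for i in range(1, len(fluorescence) - 1):
        second_diff = (
            fluorescence[i + 1] - 2 * fluorescence[i] + fluorescence[i - 1]
        )
        if second_diff > best_value:
            best_cycle, best_value = i + 1, second_diff  # cycles are 1-based
    return best_cycle


# Synthetic 50-cycle amplification curve (assumed logistic shape).
curve = [1.0 / (1.0 + math.exp(-(cycle - 30) / 1.5)) for cycle in range(1, 51)]
ct = ct_max_second_difference(curve)
```

On a logistic curve this point falls a couple of cycles before the midpoint of the rise, which is why lower template input shifts the reported Ct to later cycles.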

Northern Blot Analysis 

SARS-CoV HK-39 strain infected Vero cells were provided by the Department of
Microbiology, the University of Hong Kong. Total RNA was extracted from the cells with
TRIzol® reagent (Invitrogen Life Technologies) according to the manufacturer's protocol.
Eight (8) µg of total RNA was separated by electrophoresis on a 1% agarose gel containing
3.7% formaldehyde. RNA was transferred to a positively charged nylon membrane (Roche
Diagnostic Corporation) by capillary blotting and fixed by UV cross-linking. cDNA
synthesized with the same RNA sample was used as template for probe synthesis. Four
pairs of primers amplifying fragments from 1b (nt 18057 - 18222; SEQ ID NO:2484), S (nt
21920 - 22107; SEQ ID NO:2485), M (nt 26867 - 26996; SEQ ID NO:2486) and N (nt
28658 - 28883; SEQ ID NO:2487) genes were used in probe synthesis. DIG-labeling of
probes, hybridization and detection of bands were performed with the digoxigenin system
according to the manufacturer's procedures (Roche Molecular Biochemicals). Signals were
then analysed with chemiluminescence (Fig. 24).

Results and Discussion 

A large-scale RT-PCR assay provides a rapid means of monitoring and screening
SARS suspects. The result can be used to complement clinical diagnostic evaluation. In
order to achieve a diagnostic purpose, the assay should be reliable and its accuracy should
be assured so as to prevent occurrence of both false negative and false positive results.
However, accuracy of the test may be influenced by several factors. A common technical
problem with PCR is a failure of amplification due to the presence of PCR inhibitors (see
Fig. 21).

These PCR inhibitors include heme compounds found in blood, aqueous and
vitreous humors, heparin, EDTA, urine and polyamines (Fredricks et al., 1998, J. Clin.
Micro. 36:2810-16). Currently, NPA or stool samples are collected into transport medium
to maintain the viability of the viral particles. RT-PCR was inhibited when total RNA
extracted was used directly for first-strand cDNA synthesis without any treatment (25 out of
27 samples) in a preliminary experiment. However, after a simple ethanol precipitation step,
the amplification of DNA could be restored (Fig. 19). The same result was obtained using
either the SV or SV96 total RNA Isolation System (data not shown). This demonstrated that some
components either in the medium or NPA/stool samples would affect the downstream
processes of the diagnosis test.

In addition, current sample collection procedure dilutes the virus titer in the samples,
especially during early stage of infection, in which the virus titer is low in nasal and throat
swab specimens (Drosten et al., 2003, New England Journal of Medicine, on-line at
http://content.nejm.org/cgi/reprint/NEJMoa030747v2). It was suggested that the sensitivity
of PCR tests for SARS depended on the quality of the specimen and the time of testing
during the course of the illness. In order to increase sensitivity of the test, total RNA
isolated from clinical samples was concentrated prior to 1st strand cDNA synthesis.

In order to avoid false negative PCR results due to failure in the process of RNA
isolation and 1st strand cDNA synthesis, total RNA was extracted from clinical samples in
parallel with PK-15 mammalian cells. Figure 23 shows the RT-PCR screening result on
48 clinical samples, including both NPA and stool samples. Diagnostic PCR was
performed in parallel with β-actin PCR. All samples were positive in β-actin PCR. The
result indicated that RNA and cDNA could be extracted and synthesized successfully from
the samples in a single-step protocol as disclosed herein. With this internal control, total
RNA isolation and cDNA synthesis from the samples were ensured, which eliminated false
negatives that resulted from failure in either one of the above processes. Moreover, the 96-well
assay format currently developed can be adopted into a high-throughput screening protocol,
with which we are able to obtain diagnostic results for more than 90 clinical samples in 3
hours with 1 clinical personnel, while the current existing protocol, in which samples are
processed in individual tubes, can only handle about 30-50 samples a day per technician.
Real-time quantitative PCR assay is more sensitive than conventional agarose
gel-electrophoresis-associated PCR assay (Poon et al., J. Clin. Virol. 28:233-8) and was therefore
employed for SARS-CoV diagnosis purpose. Positive signals were detected in 38 of 136
randomly selected clinical samples in both N-gene and 1b-gene specific PCR. Among these
38 positives, 3 were stool samples (2.2%) and 35 were NPA samples (25.7%). Detection
rate of the assay employing N-gene specific RT-PCR at different time points is shown in
Table 2.

Table 2 

Date of onset       No. of sample    No. of positive    Detection rate (%)
1-2                 15               2                  13.3
3-4                 17               4                  23.5
5-6                 15               4                  26.7
7-8                 13               5                  38.5
9-10                9                4                  44.4
Negative control    9                All negative
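
The detection rates in Table 2 follow directly from the sample and positive counts, as the short check below illustrates. The variable and function names are hypothetical; the counts are those listed in the table.

```python
# (date-of-onset group, number of samples, number of positives) from Table 2.
TABLE_2 = [
    ("1-2", 15, 2),
    ("3-4", 17, 4),
    ("5-6", 15, 4),
    ("7-8", 13, 5),
    ("9-10", 9, 4),
]


def detection_rate(n_samples, n_positive):
    """Detection rate in percent, rounded to one decimal place."""
    return round(100 * n_positive / n_samples, 1)


rates = {group: detection_rate(n, pos) for group, n, pos in TABLE_2}
```

The computed rates rise from 13.3% to 44.4% across the groups, matching the tabulated values.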



The 38 positive cases were confirmed by melting curve analysis of
PCR products. Specific melting temperatures of the N gene and 1b gene PCR products (85.5°C
and 80.5°C, respectively) indicated that the target fragments were amplified in the reaction.
Specificity of the assay was also validated with samples from non-SARS patients, including
patients suffering from adenovirus (n = 5), respiratory syncytial virus (n = 5), human
metapneumovirus (n = 5), influenza A virus (n = 5) and influenza B virus (n = 5). The
result shows that all of these samples were negative in the assay (Fig. ??). These results
indicate that the N-gene specific RT-PCR assay is specific for SARS-CoV diagnosis.

Furthermore, we also demonstrated that the N-gene specific PCR was more sensitive
than the PCR amplifying the 1b RNA polymerase gene. Amplification conditions for both
PCR assays were optimized (see Fig. 22) first with the plasmid construct containing a 1:1
ratio of 1b- and N-gene fragments (see Fig. 20). Dynamic range of N-gene specific PCR was
obtained (Fig. ??) and it was found to have lower Ct values than that of 1b-specific PCR.
This revealed that N-gene specific PCR could achieve higher amplification efficiency than
1b-gene specific PCR when using the same copy number of template. PCR with cDNA from
clinical samples or virus infected Vero cells was then performed. Figure 22A shows the Ct
and half-maximal values of the fluorescent signal of N gene and 1b gene specific PCR
generated from NPA, tracheal dispersion and lung biopsy from patient A. The results
indicated that fluorescent signals given in N gene specific PCR are higher (26.0% on
average, ranging from 6.3% to 60%) than those of 1b specific PCR in all positive samples.
Furthermore, Ct values of N gene specific PCR are lower (0.1-4.6 cycles) than those of 1b
specific PCR among most of the SARS-CoV positive samples (Table 3).

Table 3 



5S8S1 27.1 27.S 0.6 
55751 17£ 27.7 0.1 



34SS2 315 33.7 
3281,1 3^6 .H1.X 
33S35 SSS M.5 



45971 
.'I5972 

45m 

051 '15 



31.9 33.2 1.3 
43.S 48.2 4.6 



27.2. 27.7 8.5 



56851 2S» 26.1 2.2 
60013. 2-1.3 26.1 1.3 



31.4 31.7 0.3 



IF 123 28.7 29.4 0.7 



353 35.6 0.3 

7*538 28.4 27J -0.9, 

3116 30.0 33.5 3.5 

BUS S&.7 37.7 ijO 



68187 40.3 41.5 1.2 
mm 355 371, 2.1 



34JS 35.3 0.5 



iSTrt'S 52.1 34.5 2.1 

6S798 2&8 28M -CM 

6SK-3 3-1.6 383 3.7 

iSSOl 31.0 32:8 O.0 

70SS2 40^ .BJ 3.1 

705W 35.5 3^2 X7 



ΔCt = 1.49 ± 0.47, 95% confidence interval = 0.74 to 2.23 (F-test)

Statistical analysis indicates that the Ct of the N-gene PCR assay is significantly lower than
that of the 1b-gene assay (95% confidence interval = 0.74 to 2.23, F-test). Stronger
fluorescence signals and lower Ct values of N gene specific PCR provide a more sensitive
diagnostic result and a better target for the assay.

Using cDNA from SARS-CoV infected Vero cells, amplification curves shown in
Fig. 21B show the differences between N gene and 1b gene specific PCR. Ct of the N gene
and 1b gene specific PCR was 35.3 and 37.8, respectively. This phenomenon had two
main causes: (1) expression level of N gene was higher than that of 1b gene; and (2) copy
number of N gene was much larger than that of 1b gene, because each transcript possessed a
copy of N gene, in SARS-CoV infected cells. Northern blot analysis supported this
hypothesis (Fig. 24). When N-gene specific PCR product was used as a probe, at least five
transcripts from the virus were hybridized and gave positive signals (Figure 24). This result
agreed with the findings in which five subgenomic mRNAs were detected by Northern
hybridization of RNA from SARS-CoV infected cells using a probe derived from the 3'
untranslated region (Rota et al., 2003, Science 300:1394-99). On the other hand, when 1b
PCR product was used as a probe, only 2 transcripts with high molecular size were
hybridized, demonstrating that the copy number of N gene was much higher than that of 1b
gene, during transcription and gene expression in the host cells. The Northern hybridization
result strongly supports the conclusion that PCR amplifying regions in N gene of the
SARS-CoV is more sensitive than PCR amplifying other regions as a target for diagnostic
screening. It is possible that amplification of more than one genome region may increase
the specificity of the test (Yam W.C., et al., 2003, J. Clin. Microbiol. 41:4521-24).

In conclusion, we have developed a new generation of RT-PCR diagnosis test which
is more sensitive than conventional diagnostic test for the detection of the coronavirus
associated with SARS. The assay provides a high throughput, highly sensitive screening
platform, which enables us to scale up to test hundreds of thousands of suspected SARS
cases each day in a single working line. Incorporation of PK-15 cell as an internal control
in the assay and use of N gene as a diagnosis locus in addition to 1b gene can enhance the
sensitivity and accuracy of the test. We are adapting the protocol to 96-well real-time
quantitative PCR and sequencing format to shorten the time required for the test and to
obtain information on genotypic variation of the virus.


CLINICAL RESULTS
Clinical findings:

All 50 patients with SARS were ethnic Chinese. They represented 5 different
epidemiologically linked clusters as well as additional sporadic cases fitting the case
definition. They were hospitalized at a mean of 5 days after the onset of symptoms. The
median age was 42 years (range of 23 to 74) and the female to male ratio was 1.3. Fourteen
(28%) were health care workers and five (10%) had a history of visit to a hospital
experiencing a major outbreak of SARS. Thirteen (26%) patients had household contacts
and 12 (24%) others had social contacts with patients with SARS. Four (8%) had a history
of recent travel to mainland China.

The major complaints from most patients were fever (90%) and shortness of breath.
Cough and myalgia were present in more than half the patients (Table 4). Upper respiratory
tract symptoms such as rhinorrhea (24%) and sore throat (20%) were present in a minority
of patients. Diarrhea (10%) and anorexia (10%) were also reported. At initial examination,
auscultatory findings, such as crepitations and decreased air entry, were present in only 38%
of patients. Dry cough was reported by 62% of patients. All patients had radiological
evidence of consolidation, at the time of admission, involving 1 zone (in 36), 2 zones (13)
and 3 zones (1).

Table 4

Clinical symptoms      Number (percentage)
Fever                  50 (100%)
Chill or rigors        37 (74%)
Cough                  31 (62%)
Myalgia                27 (54%)
Malaise                25 (50%)
Running nose           12 (24%)
Sore throat            10 (20%)
Shortness of breath    10 (20%)
Anorexia               10 (20%)
Diarrhea               5 (10%)
Headache               10 (20%)
Dizziness              6 (12%)

* Truncal maculopapular rash was noted in 1 patient.



In spite of the high fever, most patients (98%) had no evidence of a leukocytosis.
Lymphopenia (68%), leucopenia (26%), thrombocytopenia (40%) and anemia (18%) were
present in peripheral blood examination (Table 5). Parenchymal liver enzyme, alanine
aminotransferase (ALT) and muscle enzyme, creatinine kinase (CPK) were elevated in 34%
and 26% respectively.

Table 5



Laboratory parameter                         Mean (range)         Percentage of abnormal    Normal range
Haemoglobin                                  12.9 (8.9 - 15.9)                              11.5 - 16.5 g/dl
  Anaemia                                                         9 (18%)
White cell count                             5.17 (1.1 - 11.4)                              4 - 11 x 10^9/L
  Leucopenia                                                      13 (26%)
Lymphocyte count                             0.78 (0.3 - 1.5)                               1.5 - 4.0 x 10^9/L
  Significant lymphopenia (<1.0 x 10^9/L)                         34 (68%)
Platelet count                               174 (88 - 351)                                 150 - 400 x 10^9/L
  Thrombocytopenia                                                20 (40%)
Alanine aminotransaminase (ALT)              63 (11 - 350)                                  6 - 53 U/L
  Elevated ALT                                                    17 (34%)
Albumin                                      37 (26 - 50)                                   42 - 54 g/L
  Low albumin                                                     34 (68%)
Globulin                                     33 (21 - 42)                                   24 - 36 g/L
  Elevated globulin                                               10 (20%)
Creatinine kinase                            244 (31 - 1379)                                34 - 138 U/L
  Elevated creatinine kinase                                      13 (26%)





Routine microbiological investigations for known viruses and bacteria by culture,
antigen detection, and PCR were negative in most cases. Blood culture was positive for
Escherichia coli in a 74-year-old male patient, who was admitted to intensive care unit, and
was attributed to hospital acquired urinary tract infection. Klebsiella pneumoniae and
Hemophilus influenzae were isolated from the sputum specimens of 2 other patients on
admission.

Oral levofloxacin 500 mg q24h was given in 9 patients and intravenous (1.2 g q8h)/
oral (375 mg tid) amoxicillin-clavulanate and intravenous/oral clarithromycin 500 mg q12h
were given in another 40 patients. Four patients were given oral oseltamivir 75 mg bid. In
one patient, intravenous ceftriaxone 2 gm q24h, oral azithromycin 500 mg q24h, and oral
amantadine 100 mg bid were given for empirical coverage of typical and atypical
pneumonia.

Nineteen patients progressed to severe disease with oxygen desaturation and
required intensive care and ventilatory support. The mean number of days of deterioration
from the onset of symptoms was 8.3 days. Intravenous ribavirin 8 mg/kg q8h and steroid
were given in 49 patients at a mean day of 6.7 after onset of symptoms.

The risk factors associated with severe complicated disease requiring intensive care
and ventilatory support were older age, lymphopenia, impaired ALT, and delayed initiation
of ribavirin and steroid (Table 6). All the complicated cases were treated with ribavirin and
steroid after admission to the intensive care unit whereas all the uncomplicated cases were
started on ribavirin and steroid in the general ward. As expected, 31 uncomplicated cases
recovered or improved whereas 8 complicated cases deteriorated with one death at the time
of writing. All 50 patients were monitored for a mean of 12 days at the time of writing.

Table 6





                                                     Complicated     Uncomplicated     P value*
                                                     case (n=19)     case (n=31)
Mean (SD) age (range)                                49.5 ± 12.7     39.0 ± 10.7       P<0.01
Male / Female ratio                                  8 / 11          14 / 17           N.S.
Underlying illness                                   5†              1‡                P<0.05
Mode of contact
  Travel to China                                    1               3                 N.S.
  Health care worker                                 5               9                 N.S.
  Hospital visit                                     1               4                 N.S.
  Household contact                                  8               5                 P<0.05
  Social contact                                     4               10                N.S.
Mean (SD) duration of symptoms to
admission (days)                                     5.2 ± 2.0       4.7 ± 2.5         N.S.
Mean (SD) admission temperature (°C)                 38.8 ± 0.9      38.7 ± 0.8        N.S.
Mean (SD) initial total peripheral WBC
count (x 10^9/L)                                     5.1 ± 2.4       5.2 ± 1.8         N.S.
Mean (SD) initial lymphocyte count
(x 10^9/L)                                           0.66 ± 0.3      0.85 ± 0.3        P<0.05
Presence of thrombocytopenia
(<150 x 10^9/L)                                      8               12                N.S.
Impaired liver function test                         11              6                 P<0.01
CXR changes (number of zones affected)               1.4             1.2               N.S.
Mean (SD) day of deterioration from the
onset of symptoms §                                  8.3 ± 2.6       Not applicable
Mean (SD) day of initiation of ribavirin
& steroid from the onset of symptoms                 7.7 ± 2.9       5.7 ± 2.6         P<0.05
Initiation of ribavirin & steroid after
deterioration                                        12              0                 P<0.001
Response to ribavirin & steroid                      11              28                P<0.05
Outcome
  Improved or recovered                              10              31                P<0.01
  Not improving ||                                   8               0                 P<0.01



* Multi-variant analysis is not performed due to low number of cases; 



t 2 patients had diabetic mellitus, 1 had hypertrophic ostructive cardiomyopathy, 1 
5 had chronic active hepatitis B, and 1 had brain tumour; 

* 1 patient had essential hypertension; 
§ desaturation requiring intensive care support; 
|| 1 died. 
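
The specification does not state which statistical test produced the P values in Table 6. As a hedged illustration only, the sketch below applies a two-sided Fisher's exact test (an assumption, chosen because the cell counts are small) to two rows of the table. The function name `fisher_exact_two_sided` is hypothetical, and the resulting values need not match those computed by the original authors.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = prob(a)
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)  # tolerance for floating-point ties
    )
    return min(p_value, 1.0)


# Counts taken from Table 6: (complicated yes, complicated no,
# uncomplicated yes, uncomplicated no), with n=19 and n=31.
ROWS = {
    "Initiation of ribavirin & steroid after deterioration": (12, 7, 0, 31),
    "Household contact": (8, 11, 5, 26),
}

P_VALUES = {name: fisher_exact_two_sided(*cells) for name, cells in ROWS.items()}

for name, p in P_VALUES.items():
    print(f"{name}: P = {p:.3g}")
```

Because the test is an assumed stand-in, the output is best read as a plausibility check on the direction and rough size of the reported associations rather than a reproduction of the table.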
Two virus isolates, subsequently identified as members of Coronaviridae (see
below), were isolated from two patients. One was from an open lung biopsy tissue of a
53-year-old male Hong Kong Chinese resident and the other from a nasopharyngeal aspirate
of a 42-year-old female with good previous health. The 53-year-old male had a history of
10-hour household contact with a Chinese visitor who came from Guangzhou and later died
from SARS. Two days after this exposure, he presented with fever, malaise, myalgia, and
headache. Crepitations were present over the right lower zone and there was a
corresponding alveolar shadow on the chest radiograph. Hematological investigation
revealed lymphopenia of 0.7 x 10^9/L with normal total white cell and platelet counts. Both
ALT (41 U/L) and CPK (405 U/L) were abnormal. Despite a combination of oral
azithromycin, amantadine, and intravenous ceftriaxone, there were increasing bilateral
pulmonary infiltrates and progressive oxygen desaturation. Therefore, an open lung biopsy
was performed 9 days after admission. Histopathological examination showed a mild
interstitial inflammation with scattered alveolar pneumocytes showing cytomegaly, granular
amphophilic cytoplasm and enlarged nuclei with prominent nucleoli. No cells showed
inclusions typical of herpesvirus or adenovirus infection. The patient required ventilation
and intensive care after the operative procedure. Empirical intravenous ribavirin and
hydrocortisone were given. He succumbed 20 days after admission. In retrospect,
coronavirus-like RNA was detected in his nasopharyngeal aspirate, lung biopsy and
post-mortem lung. He had a significant rise in titer of antibodies against his own hSARS
isolate from 1/200 to 1/1600.

The second patient from whom a hSARS virus was isolated was a 42-year-old
female with good past health. She had a history of travel to Guangzhou in mainland China
for 2 days. She presented with fever and diarrhea 5 days after her return to Hong Kong.
Physical examination showed crepitation over the right lower zone, which had a
corresponding alveolar shadow on the chest radiograph. Investigation revealed leucopenia
(2.7 x 10^9/L), lymphopenia (0.6 x 10^9/L), and thrombocytopenia (104 x 10^9/L). Despite
empirical antimicrobial coverage with amoxicillin-clavulanate, clarithromycin, and
oseltamivir, she deteriorated 5 days after admission and required mechanical ventilation and
intensive care for 5 days. She gradually improved without receiving treatment with
ribavirin or steroid. Her nasopharyngeal aspirate was positive for the virus in the RT-PCR
and she seroconverted from an antibody titre of <1/50 to 1/1600 against the hSARS isolate.

Virological findings:

Viruses were isolated on FRhK-4 cells from the lung biopsy and nasopharyngeal
aspirate, respectively, of the two patients described above. The initial cytopathic effect
appeared between 2 and 4 days after inoculation, but on subsequent passage, cytopathic
effect appeared in 24 hours. Neither virus isolate reacted with the routine panel of
reagents used to identify virus isolates, including those for influenza A and B, parainfluenza
types 1, 2 and 3, adenovirus and respiratory syncytial virus (DAKO, Glostrup, Denmark). They
also failed to react in RT-PCR assays for influenza A and HMPV or in PCR assays for
mycoplasma. The virus was ether sensitive, indicating that it was an enveloped virus.
Electron microscopy of negatively stained (2% potassium phospho-tungstate, pH 7.0) cell
culture extracts obtained by ultracentrifugation showed the presence of pleomorphic
enveloped viral particles of about 80-90 nm (range 70-130 nm) in diameter, whose
surface morphology appeared comparable to members of Coronaviridae (Figure 5a). Thin
section electron microscopy of infected cells revealed virus particles of 55-90 nm diameter
within smooth-walled vesicles in the cytoplasm (Figure 5b). Virus particles were also
seen at the cell surface. The overall findings were compatible with infection of the cells
by viruses of Coronaviridae.

A thin section electron micrograph of the lung biopsy of the 53-year-old male
showed 60-90 nm viral particles in the cytoplasm of desquamated cells. These viral
particles were similar in size and morphology to those observed in the cell-cultured virus
isolates from both patients (Figure 4).

The RT-PCR products generated in a random primer RT-PCR assay were analyzed,
and unique bands found in the virus-infected specimen were cloned and sequenced. Of 30
clones examined, a clone containing 646 base pairs (SEQ ID NO:1) of unknown origin was
identified. Sequence analysis of this DNA fragment suggested that it had weak
homology to viruses of the family Coronaviridae (data not shown). The deduced amino
acid sequence (215 amino acids; SEQ ID NO:2) from this unknown sequence, however, had
the highest homology (57%) to the RNA polymerase of bovine coronavirus and murine
hepatitis virus, confirming that this virus belongs to the family Coronaviridae.
Phylogenetic analysis of the protein sequences showed that this virus, though most closely
related to the group II coronaviruses, was a distinct virus (Figures 5a and 5b).
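
The relationship between the 646 bp fragment and its 215-residue deduced amino acid sequence can be illustrated with a short translation sketch. It uses the standard genetic code and assumes that the reading frame starts at the second base, an assumption made here only because 646 = 1 + 3 x 215. The helper names `translate` and `codon_count` are hypothetical, and the short input string is illustrative rather than a quotation of SEQ ID NO:1.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides, frame_offset=0):
    """Translate a DNA or RNA string into one-letter amino acid codes.

    Whitespace is ignored, 'u' is read as 't', unknown codons become 'X',
    and any trailing partial codon is dropped.
    """
    seq = "".join(nucleotides.lower().split()).replace("u", "t")
    seq = seq[frame_offset:]
    usable = len(seq) - len(seq) % 3
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


def codon_count(length_nt, frame_offset=0):
    """Number of complete codons in a fragment of the given length."""
    return (length_nt - frame_offset) // 3


print(codon_count(646, frame_offset=1))
print(translate("aaa tgt agt aga"))
```

With a frame offset of 1, a 646-nucleotide fragment holds exactly 215 complete codons, which agrees with the length given for SEQ ID NO:2.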

Based on the 646 bp sequence of the isolate, specific primers were designed
for RT-PCR detection of this hSARS virus genome in clinical
specimens. Of the 44 nasopharyngeal specimens available from the 50 SARS patients, 22
had evidence of hSARS RNA. Viral RNA was detectable in 10 of 18 fecal samples tested.
The specificity of the RT-PCR reaction was confirmed by sequencing selected positive
RT-PCR amplified products. None of 40 nasopharyngeal and fecal specimens from patients
with unrelated diseases were reactive in the RT-PCR assay.

To determine the dynamic range of real-time quantitative PCR, serial dilutions of
plasmid DNA containing the target sequence were made and subjected to the real-time
quantitative PCR assay. As shown in Figure 7A, the assay was able to detect as few as 10
copies of the target sequence. By contrast, no signal was observed in the water control
(Figure 7A). Positive signals were observed in 23 out of 29 serologically confirmed SARS
patients. In all of these positive cases, a unique PCR product (Tm = 82°C) corresponding to
the signal from the positive control was observed (Figure 7B, and data not shown). These
results indicated that this assay is highly specific to the target. The copy numbers of the
target sequence in these reactions ranged from 4539 to less than 10. Thus, as many as
6.48 x 10^5 copies of this viral sequence could be found in 1 ml of NPA sample. In 5 of the
above positive cases, it was possible to collect NPA samples before seroconversion. Viral RNA
was detected in 3 of these samples, indicating that this assay can detect the virus even at the
early onset of infection.
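
Quantification against serial plasmid dilutions, as described above, is commonly done by fitting a standard curve of threshold cycle against log10 copy number and inverting it for unknown samples. The following is a minimal sketch under that assumption. The threshold-cycle values are invented for illustration, the function names are hypothetical, and nothing here reproduces the data behind Figure 7A. The only figures taken from the text are 4539 copies per reaction and 6.48 x 10^5 copies per ml, whose ratio gives the implied scale factor.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies per reaction."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical ten-fold plasmid dilution series (invented Ct values).
STANDARD_COPIES = [10, 100, 1_000, 10_000, 100_000]
STANDARD_CT = [36.7, 33.4, 30.1, 26.8, 23.5]

SLOPE, INTERCEPT = fit_standard_curve(STANDARD_COPIES, STANDARD_CT)
# Amplification efficiency implied by the slope (1.0 means perfect doubling).
EFFICIENCY = 10 ** (-1 / SLOPE) - 1

# Scale factor implied by the two figures quoted in the text:
# 4539 copies per reaction versus 6.48e5 copies per ml of NPA.
IMPLIED_SCALE_PER_ML = 6.48e5 / 4539

print(f"slope={SLOPE:.2f}, intercept={INTERCEPT:.1f}, efficiency={EFFICIENCY:.2f}")
print(f"unknown at Ct 30.1: {copies_from_ct(30.1, SLOPE, INTERCEPT):.0f} copies")
print(f"implied scale factor: {IMPLIED_SCALE_PER_ML:.0f} per ml")
```

The scale factor of roughly 143 depends on extraction and reaction volumes that are not restated in this part of the specification, so it is derived here from the quoted numbers alone.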

To further validate the specificity of this assay, NPA samples from healthy
individuals (n=11) and patients suffering from adenovirus (n=11), respiratory syncytial virus
(n=11), human metapneumovirus (n=11), influenza A virus (n=13) or influenza B virus
(n=1) infection were recruited as negative controls. All of these samples, except one, were
negative in the assay. The false positive case was negative in a subsequent test. Taken
together, including the initial false positive case, the real-time quantitative PCR assay has
a sensitivity of 79% and a specificity of 98%.
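
The quoted sensitivity and specificity follow from the counts given in the two preceding paragraphs, as the sketch below shows. The group sizes are read from the text (the influenza B group is taken as n=1), and the variable names are hypothetical.

```python
# Serologically confirmed SARS patients tested by real-time quantitative PCR.
CONFIRMED_CASES = 29
TRUE_POSITIVES = 23

# Negative-control groups as listed in the text.
NEGATIVE_CONTROLS = {
    "healthy individuals": 11,
    "adenovirus": 11,
    "respiratory syncytial virus": 11,
    "human metapneumovirus": 11,
    "influenza A virus": 13,
    "influenza B virus": 1,
}
FALSE_POSITIVES = 1  # the single initially reactive control sample

TOTAL_CONTROLS = sum(NEGATIVE_CONTROLS.values())
TRUE_NEGATIVES = TOTAL_CONTROLS - FALSE_POSITIVES

SENSITIVITY = TRUE_POSITIVES / CONFIRMED_CASES
SPECIFICITY = TRUE_NEGATIVES / TOTAL_CONTROLS

print(f"sensitivity = {SENSITIVITY:.1%} ({TRUE_POSITIVES}/{CONFIRMED_CASES})")
print(f"specificity = {SPECIFICITY:.1%} ({TRUE_NEGATIVES}/{TOTAL_CONTROLS})")
```

Counting the initially reactive control as a false positive gives 57 of 58 true negatives, which rounds to the 98% stated above, and 23 of 29 rounds to 79%.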

Epidemiological data suggest that droplet transmission is one of the major routes of
transmission of this virus. The detection of live virus and of high copy numbers of
viral sequence in NPA samples in the current study clearly supports the view that cough and
sneeze droplets from SARS patients might be the major source of this infectious agent.
Interestingly, 2 out of 4 available stool samples from the SARS patients in this study were
positive in the assay (data not shown). The detection of the virus in feces suggests that
there might be other routes of transmission. It is relevant to note that a number of animal
coronaviruses are spread via the fecal-oral route (McIntosh K., 1974, Coronaviruses: a
comparative review. Current Top Microbiol Immunol. 63: 85-112). However, further
studies are required to test whether the virus in feces is infectious or not.

Currently, apart from this hSARS virus, there are two known serogroups of human
coronaviruses (229E and OC43) (Hruskova J. et al., 1990, Antibodies to human
coronaviruses 229E and OC43 in the population of C.R., Acta Virol. 34:346-52). The
primer set used in the present assay does not have homology to the strain 229E. Due to the
lack of an available corresponding OC43 sequence in GenBank, it is not known whether
these primers would cross-react with this strain. However, sequence analyses of available
sequences in other regions of the OC43 polymerase gene indicate that the novel human virus
associated with SARS is genetically distinct from OC43. Furthermore, the primers used in
this study do not have homology to any sequences from known coronaviruses. Thus, it is
very unlikely that these primers would cross-react with the strain OC43.

Apart from the novel pathogen, metapneumovirus was reported to be identified in
some SARS patients (Centers for Disease Control and Prevention, 2003, Morbidity and
Mortality Weekly Report 52: 269-272). No evidence of metapneumovirus infection was
detected in any of the patients in this study (data not shown), suggesting that the novel
hSARS virus of the invention is the key player in the pathogenesis of SARS.



Immunofluorescent antibody detection:

Thirty-five of the 50 most recent serum samples from patients with SARS had
evidence of antibodies to the hSARS virus (see Fig. 3). Of 27 patients from whom paired
acute and convalescent sera were available, all seroconverted or had a >4-fold increase in
antibody titer to the virus. Five other pairs of sera from additional SARS patients from
clusters outside this study group were also tested to provide a wider sampling of SARS
patients in the community, and all of them seroconverted. None of 80 sera from
patients with respiratory or other diseases and none of 200 sera from normal blood donors
had detectable antibody.
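
The serological criteria used above (seroconversion, or a fourfold or greater rise between acute and convalescent titers) can be expressed as a small classifier. This is a sketch under stated assumptions: titers are held as reciprocal dilutions, `None` stands for a serum that is negative at the lowest dilution tested, the threshold of 4 is applied as "at least fourfold", and the function name `classify_paired_sera` is hypothetical.

```python
def classify_paired_sera(acute, convalescent, min_fold=4):
    """Classify a pair of reciprocal antibody titers.

    `acute` and `convalescent` are reciprocal titers (200 means 1/200);
    None means no antibody was detected at the lowest dilution tested.
    """
    if convalescent is None:
        return "no antibody detected"
    if acute is None:
        # Negative acute serum followed by a positive convalescent serum.
        return "seroconversion"
    if convalescent / acute >= min_fold:
        return "fourfold rise"
    return "no significant change"


# Titers reported for the two index patients described earlier:
# 1/200 to 1/1600, and <1/50 to 1/1600.
PATIENT_1 = classify_paired_sera(200, 1600)
PATIENT_2 = classify_paired_sera(None, 1600)

print(PATIENT_1)
print(PATIENT_2)
```

Treating a titer reported as "<1/50" as undetectable is a simplification; a laboratory protocol may define the cut-off differently.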

When either seropositivity to HP-CV in a single serum or viral RNA detection in the
NPA or stool is considered evidence of infection with the hSARS virus, 45 of the 50 patients
had evidence of infection. Of the 5 patients without any virological evidence of
Coronaviridae viral infection, only one had sera tested > 14 days after
onset of clinical disease.

DISCUSSION

The outbreak of SARS is unusual in a number of aspects, in particular in the
appearance of clusters of patients with pneumonia among health care workers and family
contacts. In this series of patients with SARS, investigations for conventional pathogens of
atypical pneumonia proved negative. However, a virus that belongs to the family
Coronaviridae was isolated from the lung biopsy and nasopharyngeal aspirate obtained
from two SARS patients, respectively. Phylogenetically, the virus was not closely related to
any known human or animal coronavirus or torovirus. The present analysis is based on a
646 bp fragment (SEQ ID NO:1) of the polymerase gene, which indicates that the virus
relates to antigenic group 2 of the coronaviruses along with murine hepatitis virus and
bovine coronavirus. However, viruses of the Coronaviridae can undergo heterologous
recombination within the virus family, and genetic analysis of other parts of the genome
needs to be carried out before the nature of this new virus is more conclusively defined
(Holmes KV. Coronaviruses. Eds Knipe DM, Howley PM. Fields Virology, 4th Edition,
Lippincott Williams & Wilkins, Philadelphia, pp. 1187-1203). The biological, genetic and
clinical data, taken together, indicate that the new virus is not one of the two known human
coronaviruses.

The majority (90%) of patients with clinically defined SARS had either serological
or RT-PCR evidence of infection by this virus. In contrast, neither antibody nor viral RNA
was detectable in healthy controls. All 27 patients from whom acute and convalescent sera
were available demonstrated rising antibody titers to hSARS virus, strengthening the
contention that a recent infection with this virus is a necessary factor in the evolution of
SARS. In addition, all five pairs of acute and convalescent sera tested from patients from
other hospitals in Hong Kong also showed seroconversion to the virus. The five patients
who have not shown serological or virological evidence of hSARS virus infection need to
have later convalescent sera tested to determine whether they also seroconvert. However, the
concordance of the hSARS virus with the clinical definition of SARS appears remarkable,
given that clinical case definitions are never perfect.

No evidence of HMPV infection, either by RT-PCR or rising antibody titer against
HMPV, was detected in any of these patients. No other pathogen was consistently detected
in our group of patients with SARS. It is therefore highly likely that this hSARS virus
is either the cause of SARS or a necessary pre-requisite for disease progression. Whether or
not other microbial or other co-factors play a role in progression of the disease remains to
be investigated.

The family Coronaviridae includes the genera Coronavirus and Torovirus. They are
enveloped RNA viruses which cause disease in humans and animals. The previously
known human coronaviruses, types 229E and OC43, are major causes of the common
cold (Holmes KV. Coronaviruses. Eds Knipe DM, Howley PM. Fields Virology, 4th
Edition, Lippincott Williams & Wilkins, Philadelphia, pp. 1187-1203). While they can
occasionally cause pneumonia in older adults, neonates or immunocompromised patients
(El-Sahly HM, Atmar RL, Glezen WP, Greenberg SB. Spectrum of clinical illness in
hospitalized patients with "common cold" virus infections. Clin Infect Dis. 2000; 31: 96-100;
and Foltz EJ, Elkordy MA. Coronavirus pneumonia following autologous bone
marrow transplantation for breast cancer. Chest 1999; 115: 901-905), coronaviruses have
also been reported to be an important cause of pneumonia in military recruits, accounting
for up to 30% of cases in some studies (Wenzel RP, Hendley JO, Davies JA, Gwaltney JM.
Coronavirus infections in military recruits: Three-year study with coronavirus strains OC43
and 229E. Am Rev Respir Dis. 1974; 109: 621-624). Human coronaviruses can infect
neurons and viral RNA has been detected in the brain of patients with multiple sclerosis
(Talbot PJ, Cote G, Arbour N. Human coronavirus OC43 and 229E persistence in neural
cell cultures and human brains. Adv Exp Med Biol. - in press). On the other hand, a number
of animal coronaviruses (e.g. Porcine Transmissible Gastroenteritis Virus, Murine Hepatitis
Virus, Avian Infectious Bronchitis Virus) cause respiratory, gastrointestinal, neurological
or hepatic disease in their respective hosts (McIntosh K. Coronaviruses: a comparative
review. Current Top Microbiol Immunol. 1974; 63: 85-112).

We describe for the first time the clinical presentation and complications of SARS.
Less than 25% of patients with coronaviral pneumonia had upper respiratory tract
symptoms. As expected in atypical pneumonia, both respiratory symptoms and positive
auscultatory findings were very disproportional to the chest radiographic findings.
Gastrointestinal symptoms were present in 10%. It is relevant that the virus RNA is detected
in faeces of some patients and that coronaviruses have been associated with diarrhoea in
animals and humans (Caul EO, Egglestone SI. Further studies on human enteric
coronaviruses. Arch Virol. 1977; 54: 107-17). The high incidence of deranged liver function
tests, leucopenia, significant lymphopenia, thrombocytopenia and subsequent evolution into
adult respiratory distress syndrome suggests severe systemic inflammatory damage
induced by this hSARS virus. Thus immuno-modulation by steroid may be important to
complement the antiviral therapy by ribavirin. In this regard, it is pertinent that severe
human disease associated with the avian influenza subtype H5N1, another virus that
recently crossed from animals to humans, has also been postulated to have an
immuno-pathological component (Cheung CY, Poon LLM, Lau ASY et al. Induction of
proinflammatory cytokines in human macrophages by influenza A (H5N1) viruses: a
mechanism for the unusual severity of human disease. Lancet 2002; 360: 1831-1837). In
common with H5N1 disease, patients with severe SARS are adults, are significantly more
lymphopenic and have parameters of organ dysfunction beyond the respiratory tract (Table
4) (Yuen KY, Chan PKS, Peiris JSM, et al. Clinical features and rapid viral diagnosis of
human disease associated with avian influenza A H5N1 virus. Lancet 1998; 351: 467-471).
It is important to note that a window of opportunity of around 8 days exists from the onset
of symptoms to respiratory failure. Severe complicated cases are strongly associated with
both underlying disease and delayed use of ribavirin and steroid therapy. Following our
clinical experience in the initial cases, this combination therapy was started very early in
subsequent cases, which were largely uncomplicated at the time of admission. The
overall mortality at the time of writing is only 2% with this treatment regimen. There were
still 8 out of 19 complicated cases who had not shown significant response. It is not
possible to perform a detailed analysis of the therapeutic response to this combination
regimen due to the heterogeneous dosing and time of initiation of therapy.

Other factors associated with severe disease are acquisition of the disease through
household contact, which may be attributed to a higher dose or longer duration of viral
exposure, and the presence of underlying diseases.

The clinical description reported here pertains largely to the more severe cases
admitted to hospital. We presently have no data on the full clinical spectrum of the
emerging Coronaviridae infection in the community or in an out-patient setting. The
availability of diagnostic tests as described here will help address these questions. In
addition, it will allow questions pertaining to the period of virus shedding (and
communicability) during convalescence, the presence of virus in other body fluids and
excreta and the presence of virus shedding during the incubation period, to be addressed.

The epidemiological data at present appear to indicate that the virus is spread by
droplets or by direct and indirect contact, although airborne spread cannot be ruled out in
some instances. The finding of infectious virus in the respiratory tract supports this
contention. Preliminary evidence also suggests that the virus may be shed in the feces.
However, it is important to note that detection of viral RNA does not prove that the virus is
viable or transmissible. If viable virus is detectable in the feces, this would be a potential
additional route of transmission that needs to be considered. It is relevant to note that a
number of animal coronaviruses are spread via the fecal-oral route (McIntosh K.
Coronaviruses: a comparative review. Current Top Microbiol Immunol. 1974; 63: 85-112).

7. DEPOSIT 

A sample of isolated hSARS virus was deposited with the China Center for Type
Culture Collection (CCTCC) at Wuhan University, Wuhan 430072, China on April 2,
2003 in accordance with the Budapest Treaty on the Deposit of Microorganisms, and
accorded accession No. CCTCC-V200303, which is incorporated herein by reference in its
entirety.

8. MARKET POTENTIAL

The hSARS virus can now be grown on a large scale, which allows the development
of various diagnostic tests as described hereinabove as well as the development of vaccines
and antiviral agents that are effective in preventing, ameliorating or treating SARS. Given
the severity of the disease and its rapid global spread, it is highly likely that significant
demand for diagnostic tests, therapies and vaccines to battle against the disease will arise
on a global scale. In addition, this virus contains genetic information which is extremely
important and valuable for clinical and scientific research applications.

9. EQUIVALENTS 

Those skilled in the art will recognize, or be able to ascertain, many equivalents to
the specific embodiments of the invention described herein using no more than routine
experimentation. Such equivalents are intended to be encompassed by the following claims.

All publications, patents and patent applications mentioned in this specification are
herein incorporated by reference into the specification to the same extent as if each
individual publication, patent or patent application was specifically and individually
indicated to be incorporated herein by reference.

WHAT IS CLAIMED:

1. An isolated nucleic acid molecule consisting essentially of a nucleic acid sequence
of SEQ ID NO:2471, or a complement thereof.

2. An isolated nucleic acid molecule consisting essentially of a nucleic acid sequence 
of SEQ ID NO:2473, or a complement thereof. 

3. An isolated nucleic acid molecule which hybridizes under stringent conditions to a
nucleic acid molecule having the nucleotide sequence of claim 1 or 2, or a complement 
thereof. 

4. The nucleic acid molecule of claim 1, 2 or 3 wherein the molecule is RNA. 

5. The nucleic acid molecule of claim 1, 2 or 3 wherein the molecule is DNA. 

6. An isolated nucleic acid molecule which hybridizes under stringent conditions to the 
nucleic acid molecule of claim 1 or 2, or a complement thereof wherein the nucleic acid 
molecule encodes an amino acid sequence which has a biological activity exhibited by a 
polypeptide encoded by the nucleotide sequence of SEQ ID NO:2471 or 2473. 

7. An isolated polypeptide encoded by the nucleic acid molecule of claim 1 or 2. 

8. An antibody or an antigen-binding fragment thereof which immunospecifically
binds to the N-gene protein of a hSARS virus.

9. An antibody or an antigen-binding fragment thereof which immunospecifically
binds to the S-gene protein of a hSARS virus.

10. The antibody of claim 8 or 9, or an antigen-binding fragment thereof, which neutralizes
the hSARS virus.

11. An antibody which immunospecifically binds to the polypeptide of claim 1, or an 
antigen-binding fragment of said antibody. 

12. A method for detecting the presence of a N-gene of the hSARS virus in a biological
sample, said method comprising:

(a) contacting the sample with a compound that selectively binds to said
N-gene; and

(b) detecting whether the compound binds to said N-gene in the sample.

13. The method of claim 12, wherein the compound that binds to said N-gene is a 
nucleic acid molecule comprising a nucleotide sequence having at least 5, 10, 15, 20, 25, 30, 
35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 
700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150 or 1,200 contiguous nucleotides of 
the nucleotide sequence of SEQ ID NO:2471, or a complement thereof. 

14. The method of claim 12, wherein the compound that binds to said N-gene is a 
nucleic acid molecule comprising a nucleotide sequence of SEQ ID NO:2475, 2476, 2480 
and/or 2481. 

15. A method for detecting the presence of a S-gene of the hSARS virus in a biological 
sample, said method comprising:

(a) contacting the sample with a compound that selectively binds to said S- 
gene; and 

(b) detecting whether the compound binds to said S-gene in the sample. 

16. The method of claim 15, wherein the compound that binds to said S-gene is a 
nucleic acid molecule comprising a nucleotide sequence having at least 5, 10, 15, 20, 25, 30, 
35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 
700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, or 3,000 contiguous 
nucleotides of the nucleotide sequence of SEQ ID NO:2473, or a complement thereof.

17. The method of claim 15, wherein the compound that binds to said S-gene is a 
nucleic acid molecule comprising a nucleotide sequence of SEQ ID NO:2477 and/or 2478. 

18. A method for detecting the presence of the polypeptide of claim 7 in a sample, said
method comprising:

(a) contacting the sample with a compound that selectively binds to said
polypeptide; and

(b) detecting whether the compound binds to said polypeptide in the sample.

19. The method of claim 18, wherein the compound that binds to the polypeptide is an
antibody.

20. A method for identifying a subject infected with the hSARS virus, said method 
comprising: 

(a) obtaining total RNA from a biological sample obtained from the subject 

(b) reverse transcribing the total RNA to obtain cDNA; and 

(c) subjecting the cDNA to real-time PCR assay using a set of primers
derived from a nucleotide sequence of the N-gene of the hSARS. 

21. The method of claim 20, wherein the set of primers have nucleotide sequences of
SEQ ID NOS:2475 and 2476, respectively. 

22. The method of claim 20, wherein the set of primers have nucleotide sequences of 
SEQ ID NOS:2480 and 2481, respectively. 

23. A method for identifying a subject infected with the hSARS virus, said method
comprising: 

(a) obtaining total RNA from a biological sample obtained from the subject 

(b) reverse transcribing the total RNA to obtain cDNA; and 

(c) subjecting the cDNA to real-time PCR assay using a set of primers 
derived from a nucleotide sequence of the S-gene of the hSARS. 

24. The method of claim 23, wherein the set of primers have nucleotide sequences of 
SEQ ID NOS:2477 and 2478, respectively. 

25. A kit comprising in one or more containers one or more isolated nucleic acid 
molecules comprising a nucleotide sequence of SEQ ID NO: 2475 and/or SEQ ID NO:2476. 

26. A kit comprising in one or more containers one or more isolated nucleic acid 
molecules comprising a nucleotide sequence of SEQ ID NO:2480 and/or SEQ ID NO:2481. 

27. A kit comprising in one or more containers one or more isolated nucleic acid 
molecules comprising a nucleotide sequence of SEQ ID NO:2477 and/or SEQ ID NO:2478.

28. A kit comprising in one or more containers one or more antibodies of claim 8 or 9. 

29. A kit comprising in one or more containers one or more antibodies of claim 1 1 . 
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t aaa tgt agt aga ate ata cct gcg cgt gcg cgc gta gag tgt ttt gat 
Lys Cys 3er Arg He lie Pro Ala Arg Ala Arg Val Glu Cys Phe Asp 
15 10 15 

aaa ttc aaa gtg aat tea aca eta gaa cag tat gtt ttc tgc act gta 
Lys Phe Lys Val Asn Ser Thr Leu Glu Gin Tyr Val Phe Cys Thr Val 



aat gea ttg cca gaa aca act get gac att gta gtc ttt gat gaa ate 

Asn Ala Leu Pro Glu Thr Thr Ala Asp He Val Val Phe Asp Glu He 
35 40 45 

tct atg get act aat tat gac ttg agt gtt gtc aat get aga ctt cgt 

Ser Met Ala Thr Asn Tyr Asp Leu Ser Val Val Asn Ala Arg Leu Arg 



gca aaa cac tac gtc tat att ggc gat cct get caa tta cca gec ccc 
Ala Lys His Tyr Val Tyr He Gly Asp Pro Ala Gin Leu Pro Ala Pro 



cgc aca ttg ctg act aaa ggc aca eta gaa cca gaa tat ttt aat tea 
Arg Thr Leu Leu Thr Lys Gly Thr Leu Glu Pro Glu Tyr Phe Asn Ser 



gtg tgc aga ctt atg aaa aca ata ggt cca gac atg ttc ctt gga act 

Val Cys Arg Leu Met Lys Thr He Gly Pro Asp Met Phe Leu Gly Thr 

100 105 110 

tgt cgc cgt tgt cct get gaa att gtt gac act gtg agt get tta gtt 

Cys Arg Arg Cys Pro Ala Glu He Val Asp Thr Val Ser Ala Leu Val 



tat gac aat aag eta aaa gca cac aag gag aag tea get caa tgc ttc 
Tyr Asp Asn Lys Leu Lys Ala His Lys Glu Lys Ser Ala Gin Cys Phe 



aaa atg ttc tac aaa ggt gtt att aca cat gat gtt tea tct gca ate 

Lys Met Phe Tyr Lys Gly Val He Thr His Asp Val Ser Ser Ala He 

145 150 155 160 

aac aga cct caa ata ggc gtt gta aga gaa ttt ctt aca cgc aat cct 

Asn Arg Pro Gin He Gly Val Val Arg Glu Phe Leu Thr Arg Asn Pro 

165 170 175 

get tgg aga aaa get gtt ttt ate tea cct tat aat tea cag aac get 

Ala Trp Arg Lys Ala Val Phe He Ser Pro Tyr Asn Ser Gin Asn Ala 

180 185 190 

gta get tea aaa ate tta gga ttg cct acg cag act gtt gat tea tea 

Val Ala Ser Lys He Leu Gly Leu Pro Thr Gin Thr Val Asp Ser Ser 

195 200 205 

cag ggt tct gaa tat gac tat gtc ata ttc aca caa act act gaa aca 

Gin Gly Ser Glu Tyr Asp Tyr Val He Phe Thr Gin Thr Thr Glu Thr 

210 215 220
gca cac tct tgt aat gtc aac cgc ttc aat gtg gct atc aca agg gca 721
Ala His Ser Cys Asn Val Asn Arg Phe Asn Val Ala Ile Thr Arg Ala

225 230 235 240

aaa att ggc att ttg tgc ata atg tct gat aga gat ctt tat gac aaa 769
Lys Ile Gly Ile Leu Cys Ile Met Ser Asp Arg Asp Leu Tyr Asp Lys
245 250 255

ctg caa ttt aca agt cta gaa ata cca cgt cgc aat gtg gct aca tta 817
Leu Gln Phe Thr Ser Leu Glu Ile Pro Arg Arg Asn Val Ala Thr Leu
260 265 270

caa gca gaa aat gta act gga ctt ttt aag gac tgt agt aag atc att 865
Gln Ala Glu Asn Val Thr Gly Leu Phe Lys Asp Cys Ser Lys Ile Ile
275 280 285

act ggt ctt cat cct aca cag gca cct aca cac ctc agc gtt gat ata 913
Thr Gly Leu His Pro Thr Gln Ala Pro Thr His Leu Ser Val Asp Ile
290 295 300

aaa ttc aag act gaa gga tta tgt gtt gac ata cca ggc ata cca aag 961
Lys Phe Lys Thr Glu Gly Leu Cys Val Asp Ile Pro Gly Ile Pro Lys
305 310 315 320

gac atg acc tac cgt aga ctc atc tct atg atg ggt ttc aaa atg aat 1009
Asp Met Thr Tyr Arg Arg Leu Ile Ser Met Met Gly Phe Lys Met Asn
325 330 335

tac caa gtc aat ggt tac cct aat atg ttt atc acc cgc gaa gaa gct 1057
Tyr Gln Val Asn Gly Tyr Pro Asn Met Phe Ile Thr Arg Glu Glu Ala
340 345 350

att cgt cac gtt cgt gcg tgg att ggc ttt gat gta gag ggc tgt cat 1105
Ile Arg His Val Arg Ala Trp Ile Gly Phe Asp Val Glu Gly Cys His
355 360 365

gca act aga gat gct gtg ggt act aac cta cct ctc cag cta gga ttt 1153
Ala Thr Arg Asp Ala Val Gly Thr Asn Leu Pro Leu Gln Leu Gly Phe
370 375 380

tct aca ggt gtt aac tta gta gct gta ccg act ggt tat gtt gac act 1201
Ser Thr Gly Val Asn Leu Val Ala Val Pro Thr Gly Tyr Val Asp Thr
385 390 395 400

gaa aat aac cta 1213
Glu Asn Asn Leu

c aga acc atg cct aac atg ctt agg ata atg gcc tct ctt gtt ctt gct 49
Arg Thr Met Pro Asn Met Leu Arg Ile Met Ala Ser Leu Val Leu Ala
1 5 10 15

cgc aaa cat aac act tgc tgt aac tta tca cac cgt ttc tac agg tta 97
Arg Lys His Asn Thr Cys Cys Asn Leu Ser His Arg Phe Tyr Arg Leu
20 25 30

gct aac gag tgt gcg caa gta tta agt gag atg gtc atg tgt ggc ggc 145
Ala Asn Glu Cys Ala Gln Val Leu Ser Glu Met Val Met Cys Gly Gly
35 40 45

tca cta tat gtt aaa cca ggt gga aca tca tcc ggt gat gct aca act 193
Ser Leu Tyr Val Lys Pro Gly Gly Thr Ser Ser Gly Asp Ala Thr Thr
50 55 60

gct tat gct aat agt gtc ttt aac att tgt caa gct gtt aca gcc aat 241
Ala Tyr Ala Asn Ser Val Phe Asn Ile Cys Gln Ala Val Thr Ala Asn
65 70 75 80

gta aat gca ctt ctt tca act gat ggt aat aag ata gct gac aag tat 289
Val Asn Ala Leu Leu Ser Thr Asp Gly Asn Lys Ile Ala Asp Lys Tyr
85 90 95

gtc cgc aat cta caa cac agg ctc tat gag tgt ctc tat aga aat agg 337
Val Arg Asn Leu Gln His Arg Leu Tyr Glu Cys Leu Tyr Arg Asn Arg
100 105 110

gat gtt gat cat gaa ttc gtg gat gag ttt tac gct tac ctg cgt aaa 385
Asp Val Asp His Glu Phe Val Asp Glu Phe Tyr Ala Tyr Leu Arg Lys
115 120 125

cat ttc tcc atg atg att ctt tct gat gat gcc gtt gtg tgc tat aac 433
His Phe Ser Met Met Ile Leu Ser Asp Asp Ala Val Val Cys Tyr Asn
130 135 140

agt aac tat gcg gct caa ggt tta gta gct agc att aag aac ttt aag 481
Ser Asn Tyr Ala Ala Gln Gly Leu Val Ala Ser Ile Lys Asn Phe Lys
145 150 155 160

gca gtt ctt tat tat caa aat aat gtg ttc atg tct gag gca aaa tgt 529
Ala Val Leu Tyr Tyr Gln Asn Asn Val Phe Met Ser Glu Ala Lys Cys
165 170 175

tgg act gag act gac ctt act aaa gga cct cac gaa ttt tgc tca cag 577
Trp Thr Glu Thr Asp Leu Thr Lys Gly Pro His Glu Phe Cys Ser Gln
180 185 190

cat aca atg cta gtt aaa caa gga gat gat tac gtg tac ctg cct tac 625
His Thr Met Leu Val Lys Gln Gly Asp Asp Tyr Val Tyr Leu Pro Tyr
195 200 205

cca gat cca tca aga ata tta ggc gca ggc tgt ttt gtc gat gat att 673
Pro Asp Pro Ser Arg Ile Leu Gly Ala Gly Cys Phe Val Asp Asp Ile
210 215 220

gtc aaa cag atg gta cac tta tga ttg aaa ggt tcc gtg tca ctg gct 721
Val Lys Gln Met Val His Leu
225 230

1 atattaggtt tttacctacc caggaaaagc caaccaacct cgatctcttg tagatctgtt
61 ctctaaacga actttaaaat ctgtgtagct gtcgctcggc tgcatgccta gtgcacctac 
121 gcagtataaa caataataaa ttttactgtc gttgacaaga aacgagtaac tcgtccctct
181 tctgcagact gcttacggtt tcgtccgtgt tgcagtcgat catcagcata cctaggtttc 
241 gtccgggtgt gaccgaaagg taagatggag agccttgttc ttggtgtcaa cgagaaaaca 
301 cacgtccaac tcagtttgcc tgtccttcag gttagagacg tgctagtgcg tggcttcggg 
361 gactctgtgg aagaggccct atcggaggca cgtgaacacc tcaaaaatgg cacttgtggt
421 ctagtagagc tggaaaaagg cgtactgccc cagcttgaac agccctatgt gttcattaaa 
481 cgttctgatg ccttaagcac caatcacggc cacaaggtcg ttgagctggt tgcagaaatg 
541 gacggcattc agtacggtcg tagcggtata acactgggag tactcgtgcc acatgtgggc
601 gaaaccccaa ttgcataccg caatgttctt cttcgtaaga acggtaataa gggagccggt
661 ggtcatagct atggcatcga tctaaagtct tatgacttag gtgacgagct tggcactgat 
721 cccattgaag attatgaaca aaactggaac actaagcatg gcagtggtgc actccgtgaa
781 ctcactcgtg agctcaatgg aggtgcagtc actcgctatg tcgacaacaa tttctgtggc
841 ccagatgggt accctcttga ttgcatcaaa gattttctcg cacgcgcggg caagtcaatg 
901 tgcactcttt ccgaacaact tgattacatc gagtcgaaga gaggtgtcta ctgctgccgt 
961 gaccatgagc atgaaattgc ctggttcact gagcgctctg ataagagcta cgagcaccag 
1021 acacccttcg aaattaagag tgccaagaaa tttgacactt tcaaagggga atgcccaaag 
1081 tttgtgtttc ctcttaactc aaaagtcaaa gtcattcaac cacgtgttga aaagaaaaag
1141 actgagggtt tcatggggcg tatacgctct gtgtaccctg ttgcatctcc acaggagtgt 
1201 aacaatatgc acttgtctac cttgatgaaa tgtaatcatt gcgatgaagt ttcatggcag 
1261 acgtgcgact ttctgaaagc cacttgtgaa cattgtggca ctgaaaattt agttattgaa 
1321 ggacctacta catgtgggta cctacctact aatgctgtag tgaaaatgcc atgtcctgcc
1381 tgtcaagacc cagagattgg acctgagcat agtgttgcag attatcacaa ccactcaaac 
1441 attgaaactc gactccgcaa gggaggtagg actagatgtt ttggaggctg tgtgtttgcc
1501 tatgttggct gctataataa gcgtgcctac tgggttcctc gtgctagtgc tgatattggc 
1561 tcaggccata ctggcattac tggtgacaat gtggagacct tgaatgagga tctccttgag 
1621 atactgagtc gtgaacgtgt taacattaac attgttggcg attttcattt gaatgaagag 
1681 gttgccatca ttttggcatc tttctctgct tctacaagtg cctttattga cactataaag
1741 agtcttgatt acaagtcttt caaaaccatt gttgagtcct gcggtaacta taaagttacc
1801 aagggaaagc ccgtaaaagg tgcttggaac attggacaac agagatcagt tttaacacca
1861 ctgtgtggtt ttccctcaca ggctgctggt gttatcagat caatttttgc gcgcacactt
1921 gatgcagcaa accactcaat tcctgatttg caaagagcag ctgtcaccat acttgatggt 
1981 atttctgaac agtcattacg tcttgtcgac gccatggttt atacttcaga cctgctcacc 
2041 aacagtgtca ttattatggc atatgtaact ggtggtcttg tacaacagac ttctcagtgg 
2101 ttgtctaatc ttttgggcac tactgttgaa aaactcaggc ctatctttga atggattgag 
2161 gcgaaactta gtgcaggagt tgaatttctc aaggatgctt gggagattct caaatttctc 
2221 attacaggtg tttttgacat cgtcaagggt caaatacagg ttgcttcaga taacatcaag
2281 gattgtgtaa aatgcttcat tgatgttgtt aacaaggcac tcgaaatgtg cattgatcaa 
2341 gtcactatcg ctggcgcaaa gttgcgatca ctcaacttag gtgaagtctt catcgctcaa 
2401 agcaagggac tttaccgtca gtgtatacgt ggcaaggagc agctgcaact actcatgcct 
2461 cttaaggcac caaaagaagt aacctttctt gaaggtgatt cacatgacac agtacttacc
2521 tctgaggagg ttgttctcaa gaacggtgaa ctcgaagcac tcgagacgcc cgttgatagc 
2581 ttcacaaatg gagctatcgt cggcacacca gtctgtgtaa atggcctcat gctcttagag 
2641 attaaggaca aagaacaata ctgcgcattg tctcctggtt tactggctac aaacaatgtc
2701 tttcgcttaa aagggggtgc accaattaaa ggtgtaacct ttggagaaga tactgtttgg
2761 gaagttcaag gttacaagaa tgtgagaatc acatttgagc ttgatgaacg tgttgacaaa 
2821 gtgcttaatg aaaagtgctc tgtctacact gttgaatccg gtaccgaagt tactgagttt 
2881 gcatgtgttg tagcagaggc tgttgtgaag actttacaac cagtttctga tctccttacc
2941 aacatgggta ttgatcttga tgagtggagt gtagctacat tctacttatt tgatgatgct
3001 ggtgaagaaa acttttcatc acgtatgtat tgttcctttt accctccaga tgaggaagaa 
3061 gaggacgatg cagagtgtga ggaagaagaa attgatgaaa cctgtgaaca tgagtacggt 
3121 acagaggatg attatcaagg tctccctctg gaatttggtg cctcagctga aacagttcga 
3181 gttgaggaag aagaagagga agactggctg gatgatacta ctgagcaatc agagattgag 
3241 ccagaaccag aacctacacc tgaagaacca gttaatcagt ttactggtta tttaaaactt 
3301 actgacaatg ttgccattaa atgtgttgac atcgttaagg aggcacaaag tgctaatcct
3361 atggtgattg taaatgctgc taacatacac ctgaaacatg gtggtggtgt agcaggtgca
3421 ctcaacaagg caaccaatgg tgccatgcaa aaggagagtg atgattacat taagctaaat 
3481 ggccctctta cagtaggagg gtcttgtttg ctttctggac ataatcttgc taagaagtgt 
3541 ctgcatgttg ttggacctaa cctaaatgca ggtgaggaca tccagcttct taaggcagca 
3601 tatgaaaatt tcaattcaca ggacatctta cttgcaccat tgttgtcagc aggcatattt 
3661 ggtgctaaac cacttcagtc tttacaagtg tgcgtgcaga cggttcgtac acaggtttat
3721 attgcagtca atgacaaagc tctttatgag caggttgtca tggattatct tgataacctg 
3781 aagcctagag tggaagcacc taaacaagag gagccaccaa acacagaaga ttccaaaact
3841 gaggagaaat ctgtcgtaca gaagcctgtc gatgtgaagc caaaaattaa ggcctgcatt
3901 gatgaggtta ccacaacact ggaagaaact aagtttctta ccaataagtt actcttgttt 
3961 gctgatatca atggtaagct ttaccatgat tctcagaaca tgcttagagg tgaagatatg 
4021 tctttccttg agaaggatgc accttacatg gtaggtgatg ttatcactag tggtgatatc 
4081 acttgtgttg taataccctc caaaaaggct ggtggcacta ctgagatgct ctcaagagct
4141 ttgaagaaag tgccagttga tgagtatata accacgtacc ctggacaagg atgtgctggt 
4201 tatacacttg aggaagctaa gactgctctt aagaaatgca aatctgcatt ttatgtacta 
4261 ccttcagaag cacctaatgc taaggaagag attctaggaa ctgtatcctg gaatttgaga
4321 gaaatgcttg ctcatgctga agagacaaga aaattaatgc ctatatgcat ggatgttaga 
4381 gccataatgg caaccatcca acgtaagtat aaaggaatta aaattcaaga gggcatcgtt 
4441 gactatggtg tccgattctt cttttatact agtaaagagc ctgtagcttc tattattacg
4501 aagctgaact ctctaaatga gccgcttgtc acaatgccaa ttggttatgt gacacatggt 
4561 tttaatcttg aagaggctgc gcgctgtatg cgttctctta aagctcctgc cgtagtgtca 
4621 gtatcatcac cagatgctgt tactacatat aatggatacc tcacttcgtc atcaaagaca 
4681 tctgaggagc actttgtaga aacagtttct ttggctggct cttacagaga ttggtcctat
4741 tcaggacagc gtacagagtt agctgttgaa tttcttaagc gtggtgacaa aattgtgtac
4801 cacactctgg agagccccgt cgagtttcat cttgacggtg aggttctttc acttgacaaa 
4861 ctaaagagtc tcttatccct gcgggaggtt aagactataa aagtgttcac aactgtggac
4921 aacactaatc tccacacaca gcttgtggat atgtctatga catatggaca gcagtttggt
4981 ccaacatact tggatggtgc tgatgttaca aaaattaaac ctcatgtaaa tcatgagggt 
5041 aagactttct ttgtactacc tagtgatgac acactacgta gtgaagcttt cgagtactac 
5101 catactcttg atgagagttt tcttggtagg tacatgtctg ctttaaacca cacaaagaaa
5161 tggaaatttc ctcaagttgg tggtttaact tcaattaaat gggctgataa caattgttat
5221 ttgtctagtg ttttattagc acttcaacag cttgaagtca aattcaatgc accagcactt
5281 caagaggctt attatagagc ccgtgctggt gatgctgcta acttttgtgc actcatactc
5341 gcttacagta ataaaactgt tggcgagctt ggtgatgtca gagaaactat gacccatctt
5401 ctacagcatg ctaatttgga atctgcaaag cgagttctta atgtggtgtg taaacattgt 
5461 ggtcagaaaa ctactacctt aacgggtgta gaagctgtga tgtatatggg tactctatct 
5521 tatgataatc ttaagacagg tgtttccatt ccatgtgtgt gtggtcgtga tgctacacaa
5581 tatctagtac aacaagagtc ttcttttgtt atgatgtctg caccacctgc tgagtataaa 
5641 ttacagcaag gtacattctt atgtgcgaat gagtacactg gtaactatca gtgtggtcat 
5701 tacactcata taactgctaa ggagaccctc tatcgtattg acggagctca ccttacaaag
5761 atgtcagagt acaaaggacc agtgactgat gttttctaca aggaaacatc ttacactaca 
5821 accatcaagc ctgtgtcgta taaactcgat ggagttactt acacagagat tgaaccaaaa
5881 ttggatgggt attataaaaa ggataatgct tactatacag agcagcctat agaccttgta
5941 ccaactcaac cattaccaaa tgcgagtttt gataatttca aactcacatg ttctaacaca 
6001 aaatttgctg atgatttaaa tcaaatgaca ggcttcacaa agccagcttc acgagagcta 
6061 tctgtcacat tcttcccaga cttgaatggc gatgtagtgg ctattgacta tagacactat 
6121 tcagcgagtt tcaagaaagg tgctaaatta ctgcataagc caattgtttg gcacattaac
6181 caggctacaa ccaagacaac gttcaaacca aacacttggt gtttacgttg tctttggagt 
6241 acaaagccag tagatacttc aaattcattt gaagttctgg cagtagaaga cacacaagga 
6301 atggacaatc ttgcttgtga aagtcaacaa cccacctctg aagaagtagt ggaaaatcct 
6361 accatacaga aggaagtcat agagtgtgac gtgaaaacta ccgaagttgt aggcaatgtc 
6421 atacttaaac catcagatga aggtgttaaa gtaacacaag agttaggtca tgaggatctt 
6481 atggctgctt atgtggaaaa cacaagcatt accattaaga aacctaatga gctttcacta 
6541 gccttaggtt taaaaacaat tgccactcat ggtattgctg caattaatag tgttccttgg 
6601 agtaaaattt tggcttatgt caaaccattc ttaggacaag cagcaattac aacatcaaat
6661 tgcgctaaga gattagcaca acgtgtgttt aacaattata tgccttatgt gtttacatta
6721 ttgttccaat tgtgtacttt tactaaaagt accaattcta gaattagagc ttcactacct
6781 acaactattg ctaaaaatag tgttaagagt gttgctaaat tatgtttgga tgccggcatt 
6841 aattatgtga agtcacccaa attttctaaa ttgttcacaa tcgctatgtg gctattgttg
6901 ttaagtattt gcttaggttc tctaatctgt gtaactgctg cttttggtgt actcttatct 
6961 aattttggtg ctccttctta ttgtaatggc gttagagaat tgtatcttaa ttcgtctaac 
7021 gttactacta tggatttctg tgaaggttct tttccttgca gcatttgttt aagtggatta 
7081 gactcccttg attcttatcc agctcttgaa accattcagg tgacgatttc atcgtacaag 
7141 ctagacttga caattttagg tctggccgct gagtgggttt tggcatatat gttgttcaca 
7201 aaattctttt atttattagg tctttcagct ataatgcagg tgttctttgg ctattttgct 
7261 agtcatttca tcagcaattc ttggctcatg tggtttatca ttagtattgt acaaatggca
7321 cccgtttctg caatggttag gatgtacatc ttctttgctt ctttctacta catatggaag 
7381 agctatgttc atatcatgga tggttgcacc tcttcgactt gcatgatgtg ctataagcgc 
7441 aatcgtgcca cacgcgttga gtgtacaact attgttaatg gcatgaagag atctttctat
7501 gtctatgcaa atggaggccg tggcttctgc aagactcaca attggaattg tctcaattgt
7561 gacacatttt gcactggtag tacattcatt agtgatgaag ttgctcgtga tttgtcactc 
7621 cagtttaaaa gaccaatcaa ccctactgac cagtcatcgt atattgttga tagtgttgct 
7681 gtgaaaaatg gcgcgcttca cctctacttt gacaaggctg gtcaaaagac ctatgagaga 
7741 catccgctct cccattttgt caatttagac aatttgagag ctaacaacac taaaggttca
7801 ctgcctatta atgtcatagt ttttgatggc aagtccaaat gcgacgagtc tgcttctaag
7861 tctgcttctg tgtactacag tcagctgatg tgccaaccta ttctgttgct tgaccaagct 
7921 cttgtatcaa acgttggaga tagtactgaa gtttccgtta agatgtttga tgcttatgtc 
7981 gacacctttt cagcaacttt tagtgttcct atggaaaaac ttaaggcact tgttgctaca 
8041 gctcacagcg agttagcaaa gggtgtagct ttagatggtg tcctttctac attcgtgtca 
8101 gctgcccgac aaggtgttgt tgataccgat gttgacacaa aggatgttat tgaatgtctc
8161 aaactttcac atcactctga cttagaagtg acaggtgaca gttgtaacaa tttcatgctc 
8221 acctataata aggttgaaaa catgacgccc agagatcttg gcgcatgtat tgactgtaat 
8281 gcaaggcata tcaatgccca agtagcaaaa agtcacaatg tttcactcat ctggaatgta 
8341 aaagactaca tgtctttatc tgaacagctg cgtaaacaaa ttcgtactgc tgccaagaag
8401 aacaacatac cttttacact aacttgtgct acaactagac aggttgtcaa tgtcataact
8461 actaaaatct cactcaaggg tggtaagatt gttagtactt gttttaaact tatgcttaag
8521 gccacattat tgtgcgttct tgctgcattg gtttgttata tcgttatgcc agtacataca
8581 ttgtcaatcc atgatggtta cacaaatgaa atcattggtt acaaagccat tcaggatggt 
8641 gtcactcgtg acatcatttc tactgatgat tgttttgcaa ataaacatgc tggttttgac 
8701 gcatggttta gccagcgtgg tggttcatac aaaaatgaca aaagctgccc tgtagtagct 
8761 gctatcatta caagagagat tggtttcata gtgcctggct taccgggtac tgtgctgaga 
8821 gcaatcaatg gtgacttctt gcattttcta cctcgtgttt ttagtgctgt tggcaacatt
8881 tgctacacac cttccaaact cattgagtat agtgattttg ctacctctgc ttgcgttctt
8941 gctgctgagt gtacaatttt taaggatgct atgggcaaac ctgtgccata ttgttatgac 
9001 actaatttgc tagagggttc tatttcttat agtgagcttc gtccagacaQ tcgttatgtg 
9061 cttatggatg gttccatcat acagtttcct aacacttacc tggagggttc tgttagagta
9121 gtaacaactt ttgatgctga gtactgtaga catggtacat gcgaaaggtc agaagtaggt
9181 atttgcctat ctaccagtgg tagatgggtt cttaataatg agcattacag agctctatca
9241 ggagttttct gtggtgttga tgcgatgaat ctcatagcta acatctttac tcctcttgtg 
9301 caacctgtgg gtgctttaga tgtgtctgct tcagtagtgg ctggtggtat tattgccata 
9361 ttggtgactt gtgctgccta ctactttatg aaattcagac gtgtttttgg tgagtacaac 
9421 catgttgttg ctgctaatgc acttttgttt ttgatgtctt tcactatact ctgtctggta 
9481 ccagcttaca gctttctgcc gggagtctac tcagtctttt acttgtactt gacattctat 
9541 ttcaccaatg atgtttcatt cttggctcac cttcaatggt ttgccatgtt ttctcctatt 
9601 gtgccttttt ggataacagc aatctatgta ttctgtattt ctctgaagca ctgccattgg 
9661 ttctttaaca actatcttag gaaaagagtc atgtttaatg gagttacatt tagtaccttc 
9721 gaggaggctg ctttgtgtac ctttttgctc aacaaggaaa tgtacctaaa attgcgtagc 
9781 gagacactgt tgccacttac acagtataac aggtatcttg ctctatataa caagtacaag 
9841 tatttcagtg gagccttaga tactaccagc tatcgtgaag cagcttgctg ccacttagca
9901 aaggctctaa atgactttag caactcaggt gctgatgttc tctaccaacc accacagaca 
9961 tcaatcactt ctgctgttct gcagagtggt tttaggaaaa tggcattccc gtcaggcaaa
10021 gttgaagggt gcatggtaca agtaacctgt ggaactacaa ctcttaatgg attgtggttg
10081 gatgacacag tatactgtcc aagacatgtc atttgcacag cagaagacat gcttaatcct
10141 aactatgaag atctgctcat tcgcaaatcc aaccatagct ttcttgttca ggctggcaat 
10201 gttcaacttc gtgttattgg ccattctatg caaaattgtc tgcttaggct taaagttgat 
10261 acttctaacc ctaagacacc caagtataaa tttgtccgta tccaacctgg tcaaacattt 
10321 tcagtcctag catgctacaa tggttcacca tctggtgttt atcagtgtgc catgagacct 
10381 aatcatacca ttaaaggttc tttccttaat ggatcatgtg gtagtgttgg ttttaacatt 
10441 gattatgatt gcgtgtcttt ctgctatatg catcatatgg agcttccaac aggagtacac 
10501 gctggtactg acttagaagg taaattctat ggtccatttg ttgacagaca aactgcacag 
10561 gctgcaggta cagacacaac cataacatta aatgttttgg catggctgta tgctgctgtt 
10621 atcaatggtg ataggtggtt tcttaataga ttcaccacta ctttgaatga ctttaacctt 
10681 gtggcaatga agtacaacta tgaacctttg acacaagatc atgttgacat attgggacct
10741 ctttctgctc aaacaggaat tgccgtctta gatatgtgtg ctgctttgaa agagctgctg
10801 cagaatggta tgaatggtcg tactatcctt ggtagcacta ttttagaaga tgagtttaca 
10861 ccatttgatg ttgttagaca atgctctggt gttaccttcc aaggtaagtt caagaaaatt
10921 gttaagggca ctcatcattg gatgctttta actttcttga catcactatt gattcttgtt 
10981 caaagtacac agtggtcact gtttttcttt gtttacgaga atgctttctt gccatttact 
11041 cttggtatta tggcaattgc tgcatgtgct atgctgcttg ttaagcataa gcacgcattc 
11101 ttgtgcttgt ttctgttacc ttctcttgca acagttgctt actttaatat ggtctacatg 
11161 cctgctagct gggtgatgcg tatcatgaca tggcttgaat tggctgacac tagcttgtct 
11221 ggttataggc ttaaggattg tgttatgtat gcttcagctt tagttttgct tattctcatg 
11281 acagctcgca ctgtttatga tgatgctgct agacgtgttt ggacactgat gaatgtcatt 
11341 acacttgttt acaaagtcta ctatggtaat gctttagatc aagctatttc catgtgggcc 
11401 ttagttattt ctgtaacctc taactattct ggtgtcgtta cgactatcat gtttttagct
11461 agagctatag tgtttgtgtg tgttgagtat tacccattgt tatttattac tggcaacacc 
11521 ttacagtgta tcatgcttgt ttattgtttc ttaggctatt gttgctgctg ctactttggc 
11581 cttttctgtt tacncaaccg ttacttcagg cttactcttg gtgtttatga ctacttggtc 
11641 tctacacaag aatttaggta tatgaactcc caggggcttt tgcctcctaa gagtagtatt 
11701 gatgctttca agcttaacat taagttgttg ggtattggag gtaaaccatg tatcaaggtt 
11761 gctactgtac agtctaaaat gtctgacgta aagtgcacat ctgtggtact gctctcggtt 
11821 cttcaacaac ttagagtaga gtcatcttct aaattgtggg cacaatgtgt acaactccac
11881 aatgatattc ttcttgcaaa agacacaact gaagctttcg agaagatggt ttctcttttg 
11941 tctgttttgc tatccatgca gggtgctgta gacattaata ggttgtgcga ggaaatgctc
12001 gataaccgtg ctactcttca ggctattgct tcagaattta gttctttacc atcatatgcc
12061 gcttatgcca ctgcccagga ggcctatgag caggctgtag ctaatggtga ttctgaagtc 
12121 gttctcaaaa agttaaagaa atctttgaat gtggctaaat ctgagtttga ccgtgatgct 
12181 gccatgcaac gcaagttgga aaagatggca gatcaggcta tgacccaaat gtacaaacag
12241 gcaagatctg aggacaagag ggcaaaagta actagtgcta tgcaaacaat gctcttcact
12301 atgcttagga agcttgataa tgatgcactt aacaacatta tcaacaatgc gcgtgatggt
12361 tgtgttccac tcaacatcat accattgact acagcagcca aactcatggt tgttgtccct 
12421 gattatggta cctacaagaa cacttgtgat ggtaacacct ttacatatgc atctgcactc 
12481 tgggaaatcc agcaagttgt tgatgcggat agcaagattg ttcaacttag tgaaattaac 
12541 atggacaatt caccaaattt ggcttggcct cttattgtta cagctctaag agccaactca 
12601 gctgttaaac tacagaataa tgaactgagt ccagtagcac tacgacagat gtcctgtgcg 
12661 gctggtacca cacaaacagc ttgtactgat gacaatgcac ttgcctacta taacaattcg 
12721 aagggaggta ggtttgtgct ggcattacta tcagaccacc aagatctcaa atgggctaga 
12781 ttccctaaga gtgatggtac aggtacaatt tacacagaac tggaaccacc ttgtaggttt 
12841 gttacagaca caccaaaagg gcctaaagtg aaatacttgt acttcatcaa aggcttaaac 
12901 aacctaaata gaggtatggt gctgggcagt ttagctgcta cagtacgtct tcaggctgga
12961 aatgctacag aagtacctgc caattcaact gtgctttcct tctgtgcttt tgcagtagac
13021 cctgctaaag catataagga ttacctagca agtggaggac aaccaatcac caactgtgtg
13081 aagatgttgt gtacacacac tggtagagga caggcaatta ctgtaacacc agaagctaac
13141 atggaccaag agtcctctgg tggtgcttca tgttgtctgt attgtagatg ccacattgac 
13201 catccaaatc ctaaaggatt ctgtgacttg aaaggtaagt acgtccaaat acctaccact
13261 tgtgctaatg acccagtggg ttttacactt agaaacacag tctgtaccgt ctgcggaatg
13321 tggaaaggtt atggctgtag ttgtgaccaa ctccgcgaac ccttgatgca gtctgcggat
13381 gcatcaacgt ttttaaacgg gtttgcggtg taagtgcagc ccgtcttaca ccgtgcggca
13441 caggcactag tactgatgtc gtctacaggg cttttgatat ttacaacgaa aaaagtgctg
13501 gttttgcaaa gttcctaaaa actaattgct gtcgcttcca ggagaaggat gaggaaggca 
13561 atttattaga ctcttacttt gtagttaaga ggcatactat gtctaactac caacatgaag 
13621 agactattta taacttggtt aaagattgtc cagcggttgc tgtccatgac tttttcaagt
13681 ttagagtaga tggtgacatg gtaccacata tatcacgtca gcgtctaact aaatacacaa 
13741 tggctgattt agtctatgct ctacgtcatt ttgatgaggg taattgtgat acattaaaag
13801 aaatactcgt cacatacaat tgctgtgatg atgattattt caataagaag gattggtatg 
13861 acttcgtaga gaatcctgac atcttacgcg tatatgctaa cttaggtgag cgtgtacgcc
13921 aatcattatt aaagactgta caattctgcg atgctatgcg tgatgcaggc attgtaggcg
13981 tactgacatt agataatcag gatcttaatg ggaactggta cgatttcggt gatttcgtac 
14041 aagtagcacc aggctgcgga gttcctattg tggattcata ttactcattg ctgatgccca 
14101 tcctcacttt gactagggca ttggctgctg agtcccatat ggatgctgat ctcgcaaaac 
14161 cacttattaa gtgggatttg ctgaaatatg attttacgga agagagactt tgtctcttcg 
14221 accgttattt taaatattgg gaccagacat accatcccaa ttgtattaac tgtttggatg 
14281 ataggtgtat ccttcattgt gcaaacttta atgtgttatt ttctactgtg tttccaccta 
14341 caagttttgg accactagta agaaaaatat ttgtagatgg tgttcctttt gttgtttcaa 
14401 ctggatacca ttttcgtgag ttaggagtcg tacataatca ggatgtaaac ttacatagct 
14461 cgcgtctcag tttcaaggaa cttttagtgt atgctgctga tccagctatg catgcagctt
14521 ctggcaattt attgctagat aaacgcacta catgcttttc agtagctgca ctaacaaaca
14581 atgttgcttt tcaaactgtc aaacccggta attttaataa agacttttat gactttgctg 
14641 tgtctaaagg tttctttaag gaaggaagtt ctgttgaact aaaacacttc ttctttgctc 
14701 aggatggcaa cgctgctatc agtgattatg actattatcg ttataatctg ccaacaatgt
14761 gtgatatcag acaactccta ttcgtagttg aagttgttga taaatacttt gattgttacg 
14821 atggtggctg tattaatgcc aaccaagtaa tcgttaacaa tctggataaa tcagctggtt
14881 tcccatttaa taaatggggt aaggctagac tttattatga ctcaatgagt tatgaggatc 
14941 aagatgcact tttcgcgtat actaagcgta atgtcatccc tactataact caaatgaatc 
15001 ttaagtatgc cattagtgca aagaatagag ctcgcaccgt agctggtgtc tctatctgta
15061 gtactatgac aaatagacag tttcatcaga aattattgaa gtcaatagcc gccactagag 
15121 gagctactgt ggtaattgga acaagcaagt tttacggtgg ctggcataat atgttaaaaa 
15181 ctgtttacag tgatgtagaa actccacacc ttatgggttg ggattatcca aaatgtgaca
15241 gagccatgcc taacatgctt aggataatgg cctctcttgt tcttgctcgc aaacataaaa
15301 cttgctgtaa cttatcacac cgtttctaca ggttagctaa cgagtgtgcg caagtattaa 
15361 gtgagatggt catgtgtggc ggctcactat atgttaaacc aggtggaaca tcatccggtg 
15421 atgctacaac tgcttatgct aatagtgtct ttaacatttg tcaagctgtt acagccaatg
15481 taaatgcact tctttcaact gatggtaata agatagctga caagtatgtc cgcaatctac
15541 aacacaggct ctatgagtgt ctctatagaa atagggatgt tgatcatgaa ttcgtggatg 
15601 agttttacgc ttacctgcgt aaacatttct ccatgatgat tctttctgat gatgccgttg 
15661 tgtgctataa cagtaactat gcggctcaag gtttagtagc tagcattaag aactttaagg 
15721 cagttcttta ttatcaaaat aatgtgttca tgtctgaggc aaaatgttgg actgagactg 

15781 accttactaa aggacctcac gaattttgct cacagcatac aatgctagtt aaacaaggag

15841 atgattacgt gtacctgcct tacccagatc catcaagaat attaggcgca ggctgttttg
15901 tcgatgatat tgtcaaaaca gatggtacac ttatgattga aaggttcgtg tcactggcta 
15961 ttgatgctta cccacttaca aaacatccta atcaggagta tgctgatgtc tttcacttgt 
16021 atttacaata cattagaaag ttacatgatg agcttactgg ccacatgttg gacatgtatt
16081 ccgtaatgct aactaatgat aacacctcac ggtactggga acctgagttt tatgaggcta
16141 tgtacacacc acatacagtc ttgcaggctg taggtgcttg tgtattgtgc aattcacaga 
16201 cttcacttcg ttgcggtgcc tgtattagga gaccattcct atgttgcaag tgctgctatg




16261 accatgtcat ttcaacatca cacaaattag tgttgtctgt taatccctat gtttgcaatg
16321 ccccaggttg tgatgtcact gatgtgacac aactgtatct aggaggtatg agctattatt 
16381 gcaagtcaca taagcctccc attagttttc cattatgtgc taatggtcag gtttttggtt
16441 tatacaaaaa cacatgtgta ggcagtgaca atgtcactga cttcaatgcg atagcaacat
16501 gtgattggac taatgctggc gattacatac ttgccaacac ttgtactgag agactcaagc 
16561 ttttcgcagc agaaacgctc aaagccactg aggaaacatt taagctgtca tatggtattg
16621 ccactgtacg cgaagtactc tctgacagag aattgcatct ttcatgggag gttggaaaac 
16681 ctagaccacc attgaacaga aactatgtct ttactggtta ccgtgtaact aaaaatagta 
16741 aagtacagat tggagagtac acctttgaaa aaggtgacta tggtgatgct gttgtgtaca
16801 gaggtactac gacatacaag ttgaatgttg gtgattactt tgtgttgaca tctcacactg
16861 taatgccact tagtgcacct actctagtgc cacaagagca ctatgtgaga attactggct 
16921 tgtacccaac actcaacatc tcagatgagt tttctagcaa tgttgcaaat tatcaaaagg 
16981 tcggcatgca aaagtactct acactccaag gaccacctgg tactggtaag agtcattttg
17041 ccatcggact tgctctctat tacccatctg ctcgcatagt gtatacggca tgctctcatg 
17101 cagctgttga tgccctatgt gaaaaggcat taaaatattt gcccatagat aaatgtagta 
17161 gaatcatacc tgcgcgtgcg cgcgtagagt gttttgataa attcaaagtg aattcaacac 
17221 tagaacagta tgttttctgc actgtaaatg cattgccaga aacaactgct gacattgtag
17281 tctttgatga aatctctatg gctactaatt atgacttgag tgttgtcaat gctagacttc
17341 gtgcaaaaca ctacgtctat attggcgatc ctgctcaatt accagccccc cgcacattgc 
17401 tgactaaagg cacactagaa ccagaatatt ttaattcagt gtgcagactt atgaaaacaa
17461 taggtccaga catgttcctt ggaacttgtc gccgttgtcc tgctgaaatt gttgacactg
17521 tgagtgcttt agtttatgac aataagctaa aagcacacaa ggataagtca gctcaatgct 
17581 tcaaaatgtt ctacaaaggt gttattacac atgatgtttc atctgcaatc aacagacctc 
17641 aaataggcgt tgtaagagaa tttcttacac gcaatcctgc ttggagaaaa gctgttttta
17701 tctcacctta taattcacag aacgctgtag cttcaaaaat cttaggattg cctacgcaga 
17761 ctgttgattc atcacagggt tctgaatatg actatgtcat attcacacaa actactgaaa
17821 cagcacactc ttgtaatgtc aaccgcttca atgtggctat cacaagggca aaaattggca 
17881 ttttgtgcat aatgtctgat agagatcttt atgacaaact gcaatttaca agtctagaaa 
17941 taccacgtcg caatgtggct acattacaag cagaaaatgt aactggactt tttaaggact
18001 gtagtaagat cattactggt cttcatccta cacaggcacc tacacacctc agcgttgata
18061 taaaattcaa gactgaagga ttatgtgttg acataccagg cataccaaag gacatgacct 
18121 accgtagact catctctatg atgggtttca aaatgaatta ccaagtcaat ggttacccta 
18181 atatgtttat cacccgcgaa gaagctattc gtcacgttcg tgcgtggatt ggctttgatg 
18241 tagagggctg tcatgcaact agagatgctg tgggtactaa cctacctctc cagctaggat 
18301 tttctacagg tgttaactta gtagctgtac cgactggtta tgttgacact gaaaataaca 
18361 cagaattcac cagagttaat gcaaaacctc caccaggtga ccagtttaaa catcttatac
18421 cactcatgta taaaggcttg ccctggaatg tagtgcgtat taagatagta caaatgctca
18481 gtgatacact gaaaggattg tcagacagag tcgtgttcgt cctttgggcg catggctttg 
18541 agcttacatc aatgaagtac tttgtcaaga ttggacctga aagaacgtgt tgtctgtgtg
18601 acaaacgtgc aacttgcttt tctacttcat cagatactta tgcctgctgg aatcattctg
18661 tgggttttga ctatgtctat aacccattta tgattgatgt tcagcagtgg ggctttacgg 
18721 gtaaccttca gagtaaccat gaccaacatt gccaggtaca tggaaatgca catgtggcta
18781 gttgtgatgc tatcatgact agatgtttag cagtccatga gtgctttgtt aagcgcgttg
18841 attggtctgt tgaataccct attataggag atgaactgag ggttaattct gcttgcagaa 
18901 aagtacaaca catggttgtg aagtctgcat tgcttgctga taagtttcca gttcttcatg 
18961 acattggaaa tccaaaggct atcaagtgtg tgcctcaggc tgaagtagaa tggaagttct
19021 acgatgctca gccatgtagt gacaaagctt acaaaataga ggaactcttc tattcttatg 
19081 ctacacatca cgataaattc actgatggtg tttgtttgtt ttggaattgt aacgttgatc 
19141 gttacccagc caatgcaatt gtgtgtaggt ttgacacaag agtcttgtca aacttgaact 
19201 taccaggctg tgatggtggt agtttgtatg tgaataagca tgcattccac actccagctt
19261 tcgataaaag tgcatttact aatttaaagc aattgccttt cttttactat tctgatagtc 
19321 cttgtgagtc tcatggcaaa caagtagtgt cggatattga ttatgttcca ctcaaatctg
19381 ctacgtgtat tacacgatgc aatttaggtg gtgctgtttg cagacaccat gcaaatgagt 
19441 accgacagta cttggatgca tataatatga tgatttctgc tggatttagc ctatggattt
19501 acaaacaatt tgatacttat aacctgtgga atacatttac caggttacag agtttagaaa 
19561 atgtggctta taatgttgtt aataaaggac actttgatgg acacgccggc gaagcacctg 
19621 tttccatcat taataatgct gtttacacaa aggtagatgg tattgatgtg gagatctttg 
19681 aaaataagac aacacttcct gttaatgttg catttgagct ttgggctaag cgtaacatta 
19741 aaccagtgcc agagattaag atactcaata atttgggtgt tgatatcgct gctaatactg
19801 taatctggga ctacaaaaga gaagccccag cacatgtatc tacaataggt gtctgcacaa 
19861 tgactgacat tgccaagaaa cctactgaga gtgcttgttc ttcacttact gtcttgtttg
19921 atggtagagt ggaaggacag gtagaccttt ttagaaacgc ccgtaatggt gttttaataa 
19981 cagaaggttc agtcaaaggt ctaacacctt caaagggacc agcacaagct agcgtcaatg 
20041 gagtcacatt aattggagaa tcagtaaaaa cacagtttaa ctactttaag aaagtagacg 
20101 gcattattca acagttgcct gaaacctact ttactcagag cagagactta gaggatttta
20161 agcccagatc acaaatggaa actgactttc tcgagctcgc tatggatgaa ttcatacagc
20221 gatataagct cgagggctat gccttcgaac acatcgttta tggagatttc agtcatggac
20281 aacttggcgg tcttcattta atgataggct tagccaagcg ctcacaagat tcaccactta 
20341 aattagagga ttttatccct atggacagca cagtgaaaaa ttacttcata acagatgcgc
20401 aaacaggttc atcaaaatgt gtgtgttctg tgattgatct tttacttgat gactttgtcg
20461 agataataaa gtcacaagat ttgtcagtga tttcaaaagt ggtcaaggtt acaattgact
20521 atgctgaaat ttcattcatg ctttggtgta aggatggaca tgttgaaacc ttctacccaa 
20581 aactacaagc aagtcaagcg tggcaaccag gtgttgcgat gcctaacttg tacaagatgc
20641 aaagaatgct tcttgaaaag tgtgaccttc agaattatgg tgaaaatgct gttataccaa 
20701 aaggaataat gatgaatgtc gcaaagtata ctcaactgtg tcaatactta aatacactta 
20761 ctttagctgt accctacaac atgagagtta ttcactttgg tgctggctct gataaaggag
20821 ttgcaccagg tacagctgtg ctcagacaat ggttgccaac tggcacacta cttgtcgatt 
20881 cagatcttaa tgacttcgtc tccgacgcag attctacttt aattggagac tgtgcaacag
20941 tacatacggc taataaatgg gaccttatta ttagcgatat gtatgaccct aggaccaaac 
21001 atgtgacaaa agagaatgac tctaaagaag ggtttttcac ttatctgtgt ggatttataa
21061 agcaaaaact agccctgggt ggttctatag ctgtaaagat aacagagcat tcttggaatg 
21121 ctgaccttta caagcttatg ggccatttct catggtggac agcttttgtt acaaatgtaa 
21181 atgcatcatc atcggaagca tttttaattg gggctaacta tcttggcaag ccgaaggaac 
21241 aaattgatgg ctataccatg catgctaact acattttctg gaggaacaca aatcctatcc 
21301 agttgtcttc ctattcactc tttgacatga gcaaatttcc tcttaaatta agaggaactg
21361 ctgtaatgtc tcttaaggag aatcaaatca atgatatgat ttattctctt ctggaaaaag
21421 gtaggcttat cattagagaa aacaacagag ttgtggtttc aagtgatatt cttgttaaca 
21481 actaaacgaa catgtttatt ttcttattat ttcttactct cactagtggt agtgaccttg 
21541 accggtgcac cacttttgat gatgttcaag ctcctaatta cactcaacat acttcatcta 
21601 tgaggggggt ttactatcct gatgaaattt ttagatcaga cactctttat ttaactcagg 
21661 atttatttct tccattttat tctaatgtta cagggtttca tactattaat catacgtttg 
21721 gcaaccctgt catacctttt aaggatggta tttattttgc tgccacagag aaatcaaatg 
21781 ttgtccgtgg ttgggttttt ggttctacca tgaacaacaa gtcacagtcg gtgattatta
21841 ttaacaattc tactaatgtt gttatacgag catgtaactt tgaattgtgt gacaaccctt
21901 tctttgctgt ttctaaaccc atgggtacac agacacatac tatgatattc gataatgcat 
21961 ttaattgcac tttcgagtac atatctgatg ccttttcgct tgatgtttca gaaaagtcag 
22021 gtaattttaa acacttacga gagtttgtgt ttaaaaataa agatgggttt ctctatgttt 
22081 ataagggcta tcaacctata gatgtagttc gtgatctacc ttctggtttt aacactttga 
22141 aacctatttt taagttgcct cttggtatta acattacaaa ttttagagcc attcttacag 
22201 ccttttcacc tgctcaagac atttggggca cgtcagctgc agcctatttt gttggctatt 
22261 taaagccaac tacatttatg ctcaagtatg atgaaaatgg tacaatcaca gatgctgttg 
22321 attgttctca aaatccactt gctgaactca aatgctctgt taagagcttt gagattgaca 
22381 aaggaattta ccagacctct aatttcaggg ttgttccctc aggagatgtt gtgagattcc 
22441 ctaatattac aaacttgtgt ccttttggag aggtttttaa tgctactaaa ttcccttctg 
22501 tctatgcatg ggagagaaaa aaaatttcta attgtgttgc tgattactct gtgctctaca 
22561 actcaacatt tttttcaacc tttaagtgct atggcgtttc tgccactaag ttgaatgatc 
22621 tttgcttctc caatgtctat gcagattctt ttgtagtcaa gggagatgat gtaagacaaa
22681 tagcgccagg acaaactggt gttattgctg attataatta taaattgcca gatgatttca
22741 tgggttgtgt ccttgcttgg aatactagga acattgatgc tacttcaact ggtaattata 
22801 attataaata taggtatctt agacatggca agcttaggcc ctttgagaga gacatatcta 
22861 atgtgccttt ctcccctgat ggcaaacctt gcaccccacc tgctcttaat tgttattggc
22921 cattaaatga ttatggtttt tacaccacta ctggcattgg ctaccaacct tacagagttg 
22981 tagtactttc ttttgaactt ttaaatgcac cggccacggt ttgtggacca aaattatcca 
23041 ctgaccttat taagaaccag tgtgtcaatt ttaattttaa tggactcact ggtactggtg 
23101 tgttaactcc ttcttcaaag agatttcaac catttcaaca atttggccgt gatgtttctg 
23161 atttcactga ttccgttcga gatcctaaaa catctgaaat attagacatt tcaccttgct 
23221 cttttggggg tgtaagtgta attacacctg gaacaaatgc ttcatctgaa gttgctgttc
23281 tatatcaaga tgttaactgc actgatgttt ctacagcaat tcatgcagat caactcacac 
23341 cagcttggcg catatattct actggaaaca atgtattcca gactcaagca ggctgtctta 
23401 taggagctga gcatgtcgac acttcttatg agtgcgacat tcctattgga gctggcattt 
23461 gtgctagtta ccatacagtt tctttattac gtagtactag ccaaaaatct attgtggctt
23521 atactatgtc tttaggtgct gatagttcaa ttgcttactc taataacacc attgctatac

23581 ctactaactt ttcaattagc attactacag aagtaatgcc tgtttctatg gctaaaacct 

23641 ccgtagattg taatatgtac atctgcggag attctactga atgtgctaat ttgcttctcc 

23701 aatatggtag cttttgcaca caactaaatc gtgcactctc aggtattgct gctgaacagg

23761 accgcaacac acgtgaagtg ttcgctcaag tcaaacaaat gtacaaaacc ccaactttga 

23821 aatattttgg tggttttaat ttttcacaaa tattacctga ccctctaaag ccaactaaga 

23881 ggtcttttat tgaggacttg ctctttaata aggtgacact cgctgatgct ggcttcatga 

23941 agcaatatgg cgaatgccta ggtgatatta atgctagaga tctcatttgt gcgcagaagt 

24001 tcaatggact tacagtgttg ccacctctgc tcactgatga tatgattgct gcctacactg 

24061 ctgctctagt tagtggtact gccactgctg gatggacatt tggtgctggc gctgctcttc 

24121 aaataccttt tgctatgcaa atggcatata ggttcaatgg cattggagtt acccaaaatg 

24181 ttctctatga gaaccaaaaa caaatcgcca accaatttaa caaggcgatt agtcaaattc 

24241 aagaatcact tacaacaaca tcaactgcat tgggcaagct gcaagacgtt gttaaccaga 

24301 atgctcaagc attaaacaca cttgttaaac aacttagctc taattttggt gcaatttcaa 

24361 gtgtgctaaa tgatatcctt tcgcgacttg ataaagtcga ggcggaggta caaattgaca 

24421 ggttaattac aggcagactt caaagccttc aaacctatgt aacacaacaa ctaatcaggg 

24481 ctgctgaaat cagggcttct gctaatcttg ctgctactaa aatgtctgag tgtgttcttg

24541 gacaatcaaa aagagttgac ttttgtggaa agggctacca ccttatgtcc ttcccacaag

24601 cagccccgca tggtgttgtc ttcctacatg tcacgtatgt gccatcccag gagaggaact 

24661 tcaccacagc gccagcaatt tgtcatgaag gcaaagcata cttccctcgt gaaggtgttt 

24721 ttgtgtttaa tggcacttct tggtttatta cacagaggaa cttcttttct ccacaaataa

24781 ttactacaga caatacattt gtctcaggaa attgtgatgt cgttattggc atcattaaca 

24841 acacagttta tgatcctctg caacctgagc ttgactcatt caaagaagag ctggacaagt 

24901 acttcaaaaa tcatacatca ccagatgttg atcttggcga catttcaggc attaacgctt 

24961 ctgtcgtcaa cattcaaaaa gaaattgacc gcctcaatga ggtcgctaaa aatttaaatg 

25021 aatcactcat tgaccttcaa gaattgggaa aatatgagca atatattaaa tggccttggt 

25081 atgtttggct cggcttcatt gctggactaa ttgccatcgt catggttaca atcttgcttt 

25141 gttgcatgac tagttgttgc agttgcctca agggtgcatg ctcttgtggt tcttgctgca 

25201 agtttgatga ggatgactct gagccagttc tcaagggtgt caaattacat tacacataaa

25261 cgaacttatg gatttgttta tgagattttt tactcttgga tcaattactg cacagccagt

25321 aaaaattgac aatgcttctc ctgcaagtac tgttcatgct acagcaacga taccgctaca

25381 agcctcactc cctttcggat ggcttgttat tggcgttgca tttcttgctg tttttcagag

25441 cgctaccaaa ataattgcgc tcaataaaag atggcagcta gccctttata agggcttcca

25501 gttcatttgc aatttactgc tgctatttgt taccatctat tcacatcttt tgcttgtcgc 

25561 tgcaggtaag gaggcgcaat ttttgtacct ctatgccttg atatattttc tacaatgcat

25621 caacgcatgt agaattatta tgagatgttg gctttgttgg aagtgcaaat ccaagaaccc

25681 attactttat gatgccaact actttgtttg ctggcacaca cataactatg actactgtat

25741 accatataac agtgtcacag atacaattgt cgttactgaa ggtgacggca tttcaacacc 

25801 aaaactcaaa gaagactacc aaattggtgg ttattctgag gataggcact caggtgttaa 

25861 agactatgtc gttgtacatg gctatttcac cgaagtttac taccagcttg agtctacaca 

25921 aattactaca gacactggta ttgaaaatgc tacattcttc atctttaaca agcttgttaa

25981 agacccaccg aatgtgcaaa tacacacaat cgacggctct tcaggagttg ctaatccagc 

26041 aatggatcca atttatgatg agccgacgac gactactagc gtgcctttgt aagcacaaga 

26101 aagtgagtac gaacttatgt actcattcgt ttcggaagaa acaggtacgt taatagttaa 

26161 tagcgtactt ctttttcttg cttccgtggt attcttgcta gtcacactag ccatccttac 

26221 tgcgcttcga ttgtgtgcgt actgctgcaa tattgttaac gtgagtttag taaaaccaac 

26281 ggtttacgtc tactcgcgtg ttaaaaatct gaactcttct gaaggagttc ctgatcttct 

26341 ggtctaaacg aactaactat tattattatt ctgtttggaa ctttaacatt gcttatcatg 

26401 gcagacaacg gtactattac cgttgaggag cttaaacaac tcctggaaca atggaaccta 

26461 gtaataggtt tcctattcct agcctggatt atgttactac aatttgccta ttctaatcgg

26521 aacaggtttt tgtacataat aaagcttgtt ttcctctggc tcttgtggcc agtaacactt

26581 gcttgttttg tgcttgctgt tgtctacaga attaattggg tgactggcgg gattgcgatt 

26641 gcaatggctt gtattgtagg cttgatgtgg cttagctact tcgttgcttc cttcaggctg 

26701 tttgctcgta cccgctcaat gtggtcattc aacccagaaa caaacattct tctcaatgtg 

26761 cctctccggg ggacaattgt gaccagaccg ctcatggaaa gtgaacttgt cattggtgct

26821 gtgatcattc gtggtcactt gcgaatggcc ggacactccc tagggcgctg tgacattaag
26881 gacctgccaa aagagatcac tgtggctaca tcacgaacgc tttcttatta caaattagga
26941 gcgtcgcagc gtgtaggcac tgattcaggt tttgctgcat acaaccgcta ccgtattgga
27001 aactataaat taaatacaga ccacgccggt agcaacgaca atattgcttt gctagtacag
27061 taagtgacaa cagatgtttc atcttgttga cttccaggtt acaatagcag agatattgat
27121 tatcattatg aggactttca ggattgctat ttggaatctt gacgttataa taagttcaat
27181 agtgagacaa ttatttaagc ctctaactaa gaagaattat tcggagttag atgatgaaga
27241 acctatggag ttagattatc cataaaacga acatgaaaat tattctcttc ctgacattga
27301 ttgtatttac atcttgcgag ctatatcact atcaggagtg tgttagaggt acgactgtac
27361 tactaaaaga accttgccca tcaggaacat acgagggcaa ttcaccattt caccctcttg
27421 ctgacaataa atttgcacta acttgcacta gcacacactt tgcttttgct tgtgctgacg
27481 gtactcgaca tacctatcag ctgcgtgcaa gatcagtttc accaaaactt ttcatcagac
27541 aagaggaggt tcaacaagag ctctactcgc cactttttct cattgttgct gctctagtat
27601 ttttaatact ttgcttcacc attaagagaa agacagaatg aatgagctca ctttaattga
27661 cttctatttg tgctttttag cctttctgct attccttgtt ttaataatgc ttattatatt
27721 ttggttttca ctcgaaatcc aggatctaga agaaccttgt accaaagtct aaacgaacat
27781 gaaacttctc attgttttga cttgtatttc tctatgcagt tgcatatgca ctgtagtaca
27841 gcgctgtgca tctaataaac ctcatgtgct tgaagatcct tgtaaggtac aacactaggg
27901 gtaatactta tagcactgct tggctttgtg ctctaggaaa ggttttacct tttcatagat
27961 ggcacactat ggttcaaaca tgcacaccta atgttactat caactgtcaa gatccagctg
28021 gtggtgcgct tatagctagg tgttggtacc ttcatgaagg tcaccaaact gctgcattta
28081 gagacgtact tgttgtttta aataaacgaa caaattaaaa tgtctgataa tggaccccaa
28141 tcaaaccaac gtagtgcccc ccgcattaca tttggtggac ccacagattc aactgacaat
28201 aaccagaatg gaggacgcaa tggggcaagg ccaaaacagc gccgacccca aggtttaccc
28261 aataatactg cgtcttggtt cacagctctc actcagcatg gcaaggagga acttagattc
28321 cctcgaggcc agggcgttcc aatcaacacc aatagtggtc cagatgacca aattggctac
28381 taccgaagag ctacccgacg agttcgtggt ggtgacggca aaatgaaaga gctcagcccc
28441 agatggtact tctattacct aggaactggc ccagaagctt cacttcccta cggcgctaac
28501 aaagaaggca tcgtatgggt tgcaactgag ggagccttga atacacccaa agaccacatt
28561 ggcacccgca atcctaataa caatgctgcc accgtgctac aacttcctca aggaacaaca
28621 ttgccaaaag gcttctacgc agagggaagc agaggcggca gtcaagcctc ttctcgctcc
28681 tcatcacgta gtcgcggtaa ttcaagaaat tcaactcctg gcagcagtag gggaaattct
28741 cctgctcgaa tggctagcgg aggtggtgaa actgccctcg cgctattgct gctagacaga
28801 ttgaaccagc ttgagagcaa agtttctggt aaaggccaac aacaacaagg ccaaactgtc
28861 actaagaaat ctgctgctga ggcatctaaa aagcctcgcc aaaaacgtac tgccacaaaa
28921 cagtacaacg tcactcaagc atttgggaga cgtggtccag aacaaaccca aggaaatttc
28981 ggggaccaag acctaatcag acaaggaact gattacaaac attggccgca aattgcacaa
29041 tttgctccaa gtgcctctgc attctttgga atgtcacgca ttggcatgga agtcacacct
29101 tcgggaacat ggctgactta tcatggagcc attaaattgg atgacaaaga tccacaattc
29161 aaagacaacg tcatactgct gaacaagcac attgacgcat acaaaacatt cccaccaaca
29221 gagcctaaaa aggacaaaaa gaaaaagact gatgaagctc agcctttgcc gcagagacaa
29281 aagaagcagc ccactgtgac tcttcttcct gcggctgaca tggatgattt ctccagacaa
29341 cttcaaaatt ccatgagtgg agcttctgct gattcaactc aggcataaac actcatgatg
29401 accacacaag gcagatgggc tatgtaaacg ttttcgcaat tccgtttacg atacatagtc
29461 tactcttgtg cagaatgaat tctcgtaact aaacagcaca agtaggttta gttaacttta
29521 atctcacata gcaatcttta atcaatgtgt aacattaggg aggacttgaa agagccacca
29581 cattttcatc gaggccacgc ggagtacgat cgagggtaca gtgaataatg ctagggagag
29641 ctgcctatat ggaagagccc taatgtgtaa aattaatttt agtagtgcta tccccatgtg
29701 attttaatag cttcttagga gaatgacaaa aaaaaaaaaa aa

1 - ATATTAGGTTTTTACCTACCCAGGAAAAGCCAACCAACCTCGATCTCTTGTAGATCTGTT - 60
-ILGFYLPRKSQPTSISCRSV
-Y*VFTYPGKANQPRSLVDLF
IRFLPTQEKPTNLDLL*ICS
61 - CTCTAAACGAACTTTAAAATCTGTGTAGCTGTCGCTCGGCTGCATGCCTAGTGCACCTAC - 120
-L*TNFKICVAVARLHA*CTY
-SKRTLKSV*LSLGCMPSAPT
LNEL*NLCSCRSAACLVHLR
121 - GCAGTATAAACAATAATAAATTTTACTGTCGTTGACAAGAAACGAGTAACTCGTCCCTCT - 180
-AV*TIINFTVVDKKRVTRPS 
-QYKQ**ILLSLTENE*LVPL 
SINNNKFYCR*QETSNSSLF 
181 - TCTGCAGACTGCTTACGGTTTCGTCCGTGTTGCAGTCGATCATCAGCATACCTAGGTTTC - 240
-SADCLRFRPCCSRSSAYLGF 
-LQTAYGFVRVAVDHQHT*VS 
CRLLTVSSVLQSIISIPRFR 
241 - GTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTTCTTGGTGTCAACGAGAAAACA - 300
-VRV*PKGKMESLVLGVNEKT
-SGCDRKVRWRALFLVSTRKH
PGVTER*DGEPCSWCQRENT
301 - CACGTCCAACTCAGTTTGCCTGTCCTTCAGGTTAGAGACGTGCTAGTGCGTGGCTTCGGG - 360 
-HVQLSLPVLQVRDVLVRGFG 
-TSNSVCLSFRLETC*CVASG 
RPTQFACPSG*RRASAWLRG 
361 - GACTCTGTGGAAGAGGCCCTATCGGAGGCACGTGAACACCTCAAAAATGGCACTTGTGGT - 420 
-DSVEEALSEAREHLKNGTCG
-TLWKRPYRRHVNTSKMALVV 
LCGRGPIGGT*TPQKWHLWS 
421 - CTAGTAGAGCTGGAAAAAGGCGTACTGCCCCAGCTTGAACAGCCCTATGTGTTCATTAAA - 480 
-LVELEKGVLPQLEQPYVFIK 

-**SWKKAYCPSLNSPMCSLN
SRAGKRRTAPA*TALCVH*T
481 - CGTTCTGATGCCTTAAGCACCAATCACGGCCACAAGGTCGTTGAGCTGGTTGCAGAAATG - 540
-RSDALSTNHGHKVVELVAEM
-VLMP*APITATRSLSWLQKW 
F*CLKHQSRPQGR*AGCRNG 
541 - GACGGCATTCAGTACGGTCGTAGCGGTATAACACTGGGAGTACTCGTGCCACATGTGGGC - 600 
-DGIQYGRSGITLGVLVPHVG 
-TAFSTVVAV*HWEYSCHMWA 
RHSVRS*RYNTGSTRATCGR 
601 - GAAACCCCAATTGCATACCGCAATGTTCTTCTTCGTAAGAACGGTAATAAGGGAGCCGGT - 660 

-ETPIAYRNVLLRKNGNKGAG
-KPQLHTAMFFFVRTVIREPV 

NPNCIPQCSSS*ER**GSRW 
661 - GGTCATAGCTATGGCATCGATCTAAAGTCTTATGACTTAGGTGACGAGCTTGGCACTGAT - 720 
-GHSYGIDLKSYDLGDELGTD 
-VIAMASI*SLMT*VTSLALI
S*LWHRSKVL*LR*RAWH*S
721 - CCCATTGAAGATTATGAACAAAACTGGAACACTAAGCATGGCAGTGGTGCACTCCGTGAA - 780 
-PIEDYEQNWNTKHGSGALRE 

- PLKIMNKTGTLSMAVVHSVN 

H*RL*TKLEH*AWQWCTP*T 
781 - CTCACTCGTGAGCTCAATGGAGGTGCAGTCACTCGCTATGTCGACAACAATTTCTGTGGC - 840 
-LTRELNGGAVTRYVDNNFCG 
-SLVSSMEVQSLAMSTTISVA 

HS*AQWRCSHSLCRQQFLWP
841 - CCAGATGGGTACCCTCTTGATTGCATCAAAGATTTTCTCGCACGCGCGGGCAAGTCAATG - 900
-PDGYPLDCIKDFLARAGKSM
-QMGTLLIASKIFSHARASQC
RWVPS*LHQRFSRTRGQVNV
901 - TGCACTCTTTCCGAACAACTTGATTACATCGAGTCGAAGAGAGGTGTCTACTGCTGCCGT - 960 
-CTLSEQLDYIESKRGVYCCR 
-ALFPNNLITSSRREVSTAAV 
HSFRTT*LHRVEERCLLLP*
961 - GACCATGAGCATGAAATTGCCTGGTTCACTGAGCGCTCTGATAAGAGCTACGAGCACCAG - 1020
-DHEHEIAWFTERSDKSYEHQ
-TMSMKLPGSLSALIRATSTR
P*A*NCLVH*AL**ELRAPD
1021 - ACACCCTTCGAAATTAAGAGTGCCAAGAAATTTGACACTTTCAAAGGGGAATGCCCAAAG - 1080
-TPFEIKSAKKFDTFKGECPK
-HPSKLRVPRNLTLSKGNAQS
TLRN*ECQEI*HFQRGMPKV 
1081 - TTTGTGTTTCCTCTTAACTCAAAAGTCAAAGTCATTCAACCACGTGTTGAAAAGAAAAAG - 1140 
-FVFP LNSKVKVI QPRVEKKK 
-LCFLLTQKSKSFNHVLKRKR 
CVSS*LKSQSHSTTC*KEKD 
1141 - ACTGAGGGTTTCATGGGGCGTATACGCTCTGTGTACCCTGTTGCATCTCCACAGGAGTGT - 1200 
-TEGFMGRIRSVYPVASPQEC 
-LRVSWGVYALCTLLHLHRSV 
*GFHGAYTLCVPCCISTGV* 
1201 - AACAATATGCACTTGTCTACCTTGATGAAATGTAATCATTGCGATGAAGTTTCATGGCAG - 1260 
-NNMHLSTLMKCNHCDEVSWQ 
-TICTCLP**NVIIAMKFHGR
QYALVYLDEM*SLR*SFMAD
1261 - ACGTGCGACTTTCTGAAAGCCACTTGTGAACATTGTGGCACTGAAAATTTAGTTATTGAA - 1320
-TCDFLKATCEHCGTENLVIE
-RATF*KPLVNIVALKI*LLK
VRLSESHL*TLWH*KFSY*R
1321 - GGACCTACTACATGTGGGTACCTACCTACTAATGCTGTAGTGAAAATGCCATGTCCTGCC - 1380
-GPTTCGYLPTNAVVKMPCPA
-DLLHVGTYLLML**KCHVLP
TYYMWVPTY*CCSENAMSCL 
1381 - TGTCAAGACCCAGAGATTGGACCTGAGCATAGTGTTGCAGATTATCACAACCACTCAAAC - 1440 
-CQDPEIGPEHSVADYHNHSN 
-VKTQRLDLSIVLQIITTTQT 
SRPRDWT*A*CCRLSQPLKH 
1441 - ATTGAAACTCGACTCCGCAAGGGAGGTAGGACTAGATGTTTTGGAGGCTGTGTGTTTGCC - 1500
-IETRLRKGGRTRCFGGCVFA 
-LKLDSAREVGLDVLEAVCLP 
*NSTPQGR*D*MFWRLCVCL 
1501 - TATGTTGGCTGCTATAATAAGCGTGCCTACTGGGTTCCTCGTGCTAGTGCTGATATTGGC - 1560 
-YVGCYNKRAYWVPRASADIG 
-MLAAIISVPTGFLVLVLILA 
CWLL**ACLLGSSC*C*YWL 
1561 - TCAGGCCATACTGGCATTACTGGTGACAATGTGGAGACCTTGAATGAGGATCTCCTTGAG - 1620
-SGHTGITGDNVETLNEDLLE
-QAILALLVTMWRP*MRISLR
RPYWHYW*QCGDLE*GSP*D
1621 - ATACTGAGTCGTGAACGTGTTAACATTAACATTGTTGGCGATTTTCATTTGAATGAAGAG - 1680
-ILSRERVNINIVGDFHLNEE
-Y*VVNVLTLTLLAIFI*MKR
TES*TC*H*HCWRFSFE*RG
1681 - GTTGCCATCATTTTGGCATCTTTCTCTGCTTCTACAAGTGCCTTTATTGACACTATAAAG - 1740
-VAIILASFSASTSAFIDTIK 
-LPSFWHLSLLLQVPLLTL*R 
CHHFGIFLCFYKCLY*HYKE
1741 - AGTCTTGATTACAAGTCTTTCAAAACCATTGTTGAGTCCTGCGGTAACTATAAAGTTACC - 1800
-SLDYKSFKTIVESCGNYKVT
-VLITSLSKPLLSPAVTIKLP
S*LQVFQNHC*VLR*L*SYQ
1801 - AAGGGAAAGCCCGTAAAAGGTGCTTGGAACATTGGACAACAGAGATCAGTTTTAACACCA - 1860
-KGKPVKGAWNIGQQRSVLTP
-RESP*KVLGTLDNRDQF*HH
GKARKRCLEHWTTEISFNTT
1861 - CTGTGTGGTTTTCCCTCACAGGCTGCTGGTGTTATCAGATCAATTTTTGCGCGCACACTT - 1920
-LCGFPSQAAGVIRSIFARTL
-CVVFPHRLLVLSDQFLRAHL
VWFSLTGCWCYQINFCAHT* 
1921 - GATGCAGCAAACCACTCAATTCCTGATTTGCAAAGAGCAGCTGTCACCATACTTGATGGT - 1980 
-DAANHSIPDLQRAAVTILDG 
-MQQTTQFLICKEQLSPYLMV 
CSKPLNS*FAKSSCHHT*WY 
1981 - ATTTCTGAACAGTCATTACGTCTTGTCGACGCCATGGTTTATACTTCAGACCTGCTCACC - 2040 
-ISEQSLRLVDAMVYTSDLLT 
-FLNSHYVLSTPWFILQTCSP 
F * TVITSCRRHGLYFRPAHQ 
2041 - AACAGTGTCATTATTATGGCATATGTAACTGGTGGTCTTGTACAACAGACTTCTCAGTGG - 2100
-NSVIIMAYVTGGLVQQTSQW
-TVSLLWHM*LVVLYNRLLSG
QCHYYGICNWWSCTTDFSVV
2101 - TTGTCTAATCTTTTGGGCACTACTGTTGAAAAACTCAGGCCTATCTTTGAATGGATTGAG - 2160
-LSNLLGTTVEKLRPIFEWIE
-CLIFWALLLKNSGLSLNGLR
V*SFGHYC*KTQAYL*MD*G
2161 - GCGAAACTTAGTGCAGGAGTTGAATTTCTCAAGGATGCTTGGGAGATTCTCAAATTTCTC - 2220 
-AKLSAGVEFLKDAWEILKFL 
-RNLVQELNFSRMLGRFSNFS 
ET*CRS*ISQGCLGDSQISH 
2221 - ATTACAGGTGTTTTTGACATCGTCAAGGGTCAAATACAGGTTGCTTCAGATAACATCAAG - 2280
-ITGVFDIVKGQIQVASDNIK 
-LQVFLTSSRVKYRLLQITSR 
YRCF*HRQGSNTGCFR*HQG 
2281 - GATTGTGTAAAATGCTTCATTGATGTTGTTAACAAGGCACTCGAAATGTGCATTGATCAA - 2340 
-DCVKCFI DVVNKALEMCIDQ 
-IV*NASLMLLTRHSKCALIK 
LCKMLH*CC*QGTRNVH*SS 
2341 - GTCACTATCGCTGGCGCAAAGTTGCGATCACTCAACTTAGGTGAAGTCTTCATCGCTCAA - 2400
-VTIAGAKLRSLNLGEVFIAQ 
-SLSLAQSCDHST*VKSSSLK 
HYRWRKVAITQLR*SLHRSK
2401 - AGCAAGGGACTTTACCGTCAGTGTATACGTGGCAAGGAGCAGCTGCAACTACTCATGCCT - 2460
-SKGLYRQCIRGKEQLQLLMP
-ARDFTVSVYVARSSCNYSCL
-QGTLPSVYTWQGAAATTHAS
2461 - CTTAAGGCACCAAAAGAAGTAACCTTTCTTGAAGGTGATTCACATGACACAGTACTTACC - 2520
-LKAPKEVTFLEGDSHDTVLT
-LRHQKK*PFLKVIHMTQYLP
*GTKRSNLS*R*FT*HSTYL
-LRRLFSRTVNSKHSRRPLIA
*GGCSQER*TRSTRDAR**L 
2581 - TTCACAAATGGAGCTATCGTCGGCACACCAGTCTGTGTAAATGGCCTCATGCTCTTAGAG - 2640 
-FTNGAIVGTPVCVNGLMLLE 
-SQMELSSAHQSV*MASCS*R 
HKWSYRRHTSLCKWPHALRD
2641 - ATTAAGGACAAAGAACAATACTGCGCATTGTCTCCTGGTTTACTGGCTACAAACAATGTC - 2700 
-IKDKEQYCALSPGLLATNNV 
-LRTKNNTAHCLLVYWLQTMS 
*GQRTILRIVSWFTGYKQCL 
2701 - TTTCGCTTAAAAGGGGGTGCACCAATTAAAGGTGTAACCTTTGGAGAAGATACTGTTTGG - 2760
-FRLKGGAPIKGVTFGEDTVW
-FA*KGVHQLKV*PLEKILFG 
SLKRGCTN*RCNLWRRYCLG 
2761 - GAAGTTCAAGGTTACAAGAATGTGAGAATCACATTTGAGCTTGATGAACGTGTTGACAAA - 2820 
-EVQGYKNVRITFELDERVDK 
-KFKVTRM*ESHLSLMNVLTK



-VLNEKCSVYTVESGTEVTEF 
-CLMKSALSTLLNPVPKLLSL 
A**KVLCLHC*IRYRSY*VC 
2881 - GCATGTGTTGTAGCAGAGGCTGTTGTGAAGACTTTACAACCAGTTTCTGATCTCCTTACC - 2940 
-ACVVAEAVVKTLQPVSDLLT 
-HVL*QRLL*RLYNQFLISLP 
MCCSRGCCEDFTTSF*SPYQ 
2941 - AACATGGGTATTGATCTTGATGAGTGGAGTGTAGCTACATTCTACTTATTTGATGATGCT - 3000
-NMGIDLDEWSVATFYLFDDA
-TWVLILMSGV*LHSTYLMML
HGY*S**VECSYILLI**CW
3001 - GGTGAAGAAAACTTTTCATCACGTATGTATTGTTCCTTTTACCCTCCAGATGAGGAAGAA - 3060 
-GEENFSSRMYCSFYPPDEEE 
-VKKTFHHVCIVPFTLQMRKK 
*RKLFITYVLFLLPSR*GRR 
3061 - GAGGACGATGCAGAGTGTGAGGAAGAAGAAATTGATGAAACCTGTGAACATGAGTACGGT - 3120 
-EDDAECEEEEIDETCEHEYG 
-RTMQSVRKKKLMKPVNMSTV 
GRCRV*GRRN**NL*T*VRY
3121 - ACAGAGGATGATTATCAAGGTCTCCCTCTGGAATTTGGTGCCTCAGCTGAAACAGTTCGA - 3180
-TEDDYQGLPLEFGASAETVR
-QRMIIKVSLWNLVPQLKQFE
RG*LSRSPSGIWCLS*NSSS
3181 - GTTGAGGAAGAAGAAGAGGAAGACTGGCTGGATGATACTACTGAGCAATCAGAGATTGAG - 3240
-VEEEEEEDWLDDTTEQSEIE
-LRKKKRKTGWMILLSNQRLS
*GRRRGRLAG*YY*AIRD*A
3241 - CCAGAACCAGAACCTACACCTGAAGAACCAGTTAATCAGTTTACTGGTTATTTAAAACTT - 3300 
-PEPEPTPEEPVNQFTGYLKL 
-QNQNLHLKNQLISLLVI*NL
RTRTYT*RTS*SVYWLFKTY 
3301 - ACTGACAATGTTGCCATTAAATGTGTTGACATCGTTAAGGAGGCACAAAGTGCTAATCCT - 3360 

-TDNVAIKCVDIVKEAQSANP
-LTMLPLNVLTSLRRHKVLIL
*QCCH*MC*HR*GGTKC*SY



FIG. 11 Cont.



WO 2004/085650 



PCT/CN2004/000246 




3361 - ATGGTGATTGTAAATGCTGCTAACATACACCTGAAACATGGTGGTGGTGTAGCAGGTGCA - 3420 
-MVIVNAANIHLKHGGGVAGA
-W*L*MLLTYT*NMVVV*QVH
GDCKCC*HTPETWWWCSRCT
3421 - CTCAACAAGGCAACCAATGGTGCCATGCAAAAGGAGAGTGATGATTACATTAAGCTAAAT - 3480
-LNKATNGAMQKESDDYIKLN
-STRQPMVPCKRRVMITLS*M
QQGNQWCHAKGE**LH*AKW
3481 - GGCCCTCTTACAGTAGGAGGGTCTTGTTTGCTTTCTGGACATAATCTTGCTAAGAAGTGT - 3540
-GPLTVGGSCLLSGHNLAKKC
-ALLQ*EGLVCFLDIILLRSV
PSYSRRVLFAFWT*SC*EVS
3541 - CTGCATGTTGTTGGACCTAACCTAAATGCAGGTGAGGACATCCAGCTTCTTAAGGCAGCA - 3600
-LHVVGPNLNAGEDIQLLKAA
-CMLLDLT*MQVRTSSFLRQH
ACCWT*PKCR*GHPAS*GSI 
3601 - TATGAAAATTTCAATTCACAGGACATCTTACTTGCACCATTGTTGTCAGCAGGCATATTT - 3660
-YENFNSQDILLAPLLSAGIF
-MKISIHRTSYLHHCCQQAYL
*KFQFTGHLTCTIVVSRHIW
3661 - GGTGCTAAACCACTTCAGTCTTTACAAGTGTGCGTGCAGACGGTTCGTACACAGGTTTAT - 3720
-GAKPLQSLQVCVQTVRTQVY
-VLNHFSLYKCACRRFVHRFI
C*TTSVFTSVRADGSYTGLY
3721 - ATTGCAGTCAATGACAAAGCTCTTTATGAGCAGGTTGTCATGGATTATCTTGATAACCTG - 3780 
-IAVNDKALYEQVVMDYLDNL 
-LQSMTKLFMSRLSWIILIT* 
CSQ*QSSL*AGCHGLS**PE
3781 - AAGCCTAGAGTGGAAGCACCTAAACAAGAGGAGCCACCAAACACAGAAGATTCCAAAACT - 3840 
-KPRVEAPKQEEPPNTEDSKT 
-SLEWKHLNKRSHQTQKIPKL 
A*SGST*TRGATKHRRFQN* 
3841 - GAGGAGAAATCTGTCGTACAGAAGCCTGTCGATGTGAAGCCAAAAATTAAGGCCTGCATT - 3900 
-EEKSVVQKPVDVKPKIKACI 
-RRNLSYRSLSM* SQKLREAL 
GEICRTEACRCEAKW*GLH* 
3901 - GATGAGGTTACCACAACACTGGAAGAAACTAAGTTTCTTACCAATAAGTTACTCTTGTTT - 3960 
-DEVTTTLEETKFLTNKLLLF 
-MRLPQHWKKLSFLPISYSCL
*GYHNTGRN*VSYQ*VTLVC
3961 - GCTGATATCAATGGTAAGCTTTACCATGATTCTCAGAACATGCTTAGAGGTGAAGATATG - 4020
-ADINGKLYHDSQNMLRGEDM
-LISMVSFTMILRTCLEVKIC
*YQW*ALP*FSEHA*R*RYV
4021 - TCTTTCCTTGAGAAGGATGCACCTTACATGGTAGGTGATGTTATCACTAGTGGTGATATC - 4080
-SFLEKDAPYMVGDVITSGDI
-LSLRRMHLTW*VMLSLVVIS
FP*EGCTLHGR*CYH*W*YH
4081 - ACTTGTGTTGTAATACCCTCCAAAAAGGCTGGTGGCACTACTGAGATGCTCTCAAGAGCT - 4140
-TCVVIPSKKAGGTTEMLSRA
-LVL*YPPKRLVALLRCSQEL
LCCNTLQKGWWHY*DALKSF
4141 - TTGAAGAAAGTGCCAGTTGATGAGTATATAACCACGTACCCTGGACAAGGATGTGCTGGT - 4200
-LKKVPVDEYITTYPGQGCAG
-*RKCQLMSI*PRTLDKDVLV
EESAS**VYNHVPWTRMCWL




4201 - TATACACTTGAGGAAGCTAAGACTGCTCTTAAGAAATGCAAATCTGCATTTTATGTACTA - 4260
-YTLEEAKTALKKCKSAFYVL 
-IHLRKLRLLLRNANLHFMYY 
YT*GS*DCS*EMQICILCTT 
4261 - CCTTCAGAAGCACCTAATGCTAAGGAAGAGATTCTAGGAACTGTATCCTGGAATTTGAGA - 4320 
-PSEAPNAKEEILGTVSWNLR 
-LQKHLMLRKRF*ELYPGI*E
FRST*C*GRDSRNCILEFER 
4321 - GAAATGCTTGCTCATGCTGAAGAGACAAGAAAATTAATGCCTATATGCATGGATGTTAGA - 4380 
-EMLAHAEETRKLMPICMDVR
-KCLLMLKRQEN*CLYAWMLE
NACSC*RDKKINAYMHGC*S
4381 - GCCATAATGGCAACCATCCAACGTAAGTATAAAGGAATTAAAATTCAAGAGGGCATCGTT - 4440
-AIMATIQRKYKGIKIQEGIV 
-P*WQPSNVSIKELKFKRASL 
HNGNHPT*V*RN*NSRGHR* 
4441 - GACTATGGTGTCCGATTCTTCTTTTATACTAGTAAAGAGCCTGTAGCTTCTATTATTACG - 4500 
-DYGVRFFFYTSKEPVASIIT 
-TMVSDSSFILVKSL*LLLLR 
LWCPILLLY**RACSFYYYE
4501 - AAGCTGAACTCTCTAAATGAGCCGCTTGTCACAATGCCAATTGGTTATGTGACACATGGT - 4560
-KLNSLNEPLVTMPIGYVTHG
-S*TL*MSRLSQCQLVM*HMV
AELSK*AACHNANWLCDTWF
4561 - TTTAATCTTGAAGAGGCTGCGCGCTGTATGCGTTCTCTTAAAGCTCCTGCCGTAGTGTCA - 4620
-FNLEEAARCMRSLKAPAVVS 
-LILKRLRAVCVLLKLLP*CQ 
*S*RGCALYAFS*SSCRSVS 
4621 - GTATCATCACCAGATGCTGTTACTACATATAATGGATACCTCACTTCGTCATCAAAGACA - 4680
-VSSPDAVTTYNGYLTSSSKT 
-YHHQMLLLHIMDTSLRHQRH 
IITRCCYYI*WIPHFVIKDI 
4681 - TCTGAGGAGCACTTTGTAGAAACAGTTTCTTTGGCTGGCTCTTACAGAGATTGGTCCTAT - 4740 
-SEEHFVETVSLAGSYRDWSY 
-LRSTL*KQFLWLALTEIGPI 
*GALCRNSFFGWLLQRLVLF
4741 - TCAGGACAGCGTACAGAGTTAGGTGTTGAATTTCTTAAGCGTGGTGACAAAATTGTGTAC - 4800 
-SGQRTELGVEFLKRGDKIVY 
-QDSVQS*VLNFLSVVTKLCT 
RTAYRVRC*IS*AW*QNCVP 
4801 - CACACTCTGGAGAGCCCCGTCGAGTTTCATCTTGACGGTGAGGTTCTTTCACTTGACAAA - 4860
-HTLESPVEFHLDGEVLSLDK
-TLWRAPSSFILTVRFFHLTN
HSGEPRRVSS*R*GSFT*QT 
4861 - CTAAAGAGTCTCTTATCCCTGCGGGAGGTTAAGACTATAAAAGTGTTCACAACTGTGGAC - 4920 
-LKSLLSLREVKTIKVFTTVD 
-*RVSYPCGRLRL*KCSQLWT 
KESLIPAGG* DYKSVHNCGQ 
4921 - AACACTAATCTCCACACACAGCTTGTGGATATGTCTATGACATATGGACAGCAGTTTGGT - 4980 
-NTNLHTQLVDMSMTYGQQFG 
-TLISTHSLWICL*HMDSSLV 
H*SPHTACGYVYDIWTAVWS 
4981 - CCAACATACTTGGATGGTGCTGATGTTACAAAAATTAAACCTCATGTAAATCATGAGGGT - 5040
-PTYLDGADVTKIKPHVNHEG
-QHTWMVLMLQKLNLM*IMRV
NILGWC*CYKN*TSCKS*G*
5041 - AAGACTTTCTTTGTACTACCTAGTGATGACACACTACGTAGTGAAGCTTTCGAGTACTAC - 5100
-KTFFVLPSDDTLRSEAFEYY 
-RLSLYYLVMTHYVVKLSSTT 
DPLCTT***HTT**SFRVLP 
5101 - CATACTCTTGATGAGAGTTTTCTTGGTAGGTACATGTCTGCTTTAAACCACACAAAGAAA - 5160 
-HTLDESFLGRYMSALNHTKK 
-ILLMRVFLVGTCLL*TTQRN
YS**EFSW*VHVCFKPHKEM
5161 - TGGAAATTTCCTCAAGTTGGTGGTTTAACTTCAATTAAATGGGCTGATAACAATTGTTAT - 5220
-WKFPQVGGLTSIKWADNNCY 
-GNFLKLVV*LQLNGLITIVI 
EISSSWWFNFN*MG**QLLF 
5221 - TTGTCTAGTGTTTTATTAGCACTTCAACAGCTTGAAGTCAAATTCAATGCACCAGCACTT - 5280 
-LSSVLLALQQLEVKFNAPAL 
-CLVFY* HFNSLKSNSMHQHF 
V*CFISTSTA*SQIQCTSTS 
5281 - CAAGAGGCTTATTATAGAGCCCGTGCTGGTGATGCTGCTAACTTTTGTGCACTCATACTC - 5340 
-QEAYYRARAGDAAN FCALIL 
-KRLI IE PVLVMLLTFVHSYS 
RGLL*SPCW*CC*LLCTHTR 
5341 - GCTTACAGTAATAAAACTGTTGGCGAGCTTGGTGATGTCAGAGAAACTATGACCCATCTT - 5400 
-AYSNKTVGELGDVRETMTHL 
-LTVIKLLASLVMSEKL*PIF 
LQ**NCWRAW*CQRNYDPSS 
5401 - CTACAGCATGCTAATTTGGAATCTGCAAAGCGAGTTCTTAATGTGGTGTGTAAACATTGT - 5460
-LQHANLESAKRVLNVVCKHC 
-YSMLIWNLQSEFLMWCVNIV 
TAC*FGICKASS*CGV*TLW 
5461 - GGTCAGAAAACTACTACCTTAACGGGTGTAGAAGCTGTGATGTATATGGGTACTCTATCT - 5520
-GQKTTTLTGVEAVMYMGTLS
-VRKLLP*RV*KL*CIWVLYL
SENYYLNGCRSCDVYGYSIL 
5521 - TATGATAATCTTAAGACAGGTGTTTCCATTCCATGTGTGTGTGGTCGTGATGCTACACAA - 5580 
-YDNLKTGVSIPCVCGRDATQ 
-MIILRQVFPFHVCVVVMLHN 
**S*DRCFHSMCVWS*CYTI 
5581 - TATCTAGTACAACAAGAGTCTTCTTTTGTTATGATGTCTGCACCACCTGCTGAGTATAAA - 5640
-YLVQQESSFVMMSAPPAEYK
-I*YNKSLLLL*CLHHLLSIN
SSTTRVFFCYDVCTTC*V*I
5641 - TTACAGCAAGGTACATTCTTATGTGCGAATGAGTACACTGGTAACTATCAGTGTGGTCAT - 5700
-LQQGTFLCANEYTGNYQCGH 
-YSKVHSYVRMSTLVTISVVI 
TARYILMCE*VHW*LSVWSL 
5701 - TACACTCATATAACTGCTAAGGAGACCCTCTATCGTATTGACGGAGCTCACCTTACAAAG - 5760 
-YTHITAKETLYRIDGAHLTK 
-TLI*LLRRPSIVLTELTLQR 
HSYNC*GDPLSY*RSSPYKD 
5761 - ATGTCAGAGTACAAAGGACCAGTGACTGATGTTTTCTACAAGGAAACATCTTACACTACA - 5820 
-MSEYKGPVTDVFYKETSYTT 
-CQSTKDQ*LMFSTRKHLTLQ 
VRVQRTSD*CFLQGNILHYN
5821 - ACCATCAAGCCTGTGTCGTATAAACTCGATGGAGTTACTTACACAGAGATTGAACCAAAA - 5880 
-TIKPVSYKLDGVTYTEIEPK 
-PSSLCRINSMELLTQRLNQN 
HQACVV*TRWSYLHRD*TKI
5881 - TTGGATGGGTATTATAAAAAGGATAATGCTTACTATACAGAGCAGCCTATAGACCTTGTA - 5940
-LDGYYKKDNAYYTEQPIDLV
-WMGIIKRIMLTIQSSL*TLY
GWVL*KG*CLLYRAAYRPCT
5941 - CCAACTCAACCATTACCAAATGCGAGTTTTGATAATTTCAAACTCACATGTTCTAACACA - 6000
-PTQPLPNASFDNFKLTCSNT
-QLNHYQMRVLIISNSHVLTQ
NSTITKCEF**FQTHMF*HK
6001 - AAATTTGCTGATGATTTAAATCAAATGACAGGCTTCACAAAGCCAGCTTCACGAGAGCTA - 6060 
-KFADDLNQMTGFTKPASREL 
-NLLMI*IK*QASQSQLHESY 
IC**FKSNDRLHKASFTRAI
6061 - TCTGTCACATTCTTCCCAGACTTGAATGGCGATGTAGTGGCTATTGACTATAGACACTAT - 6120
-SVTFFPDLNGDVVAIDYRHY 
-LSHSSQT*MAM*WLLTIDTI 
CHILPRLEWRCSGY*L*TLF 
6121 - TCAGCGAGTTTCAAGAAAGGTGCTAAATTACTGCATAAGCCAATTGTTTGGCACATTAAC - 6180 
-SASFKKGAKLLHKPIVWHIN 
-QRVSRKVLNYCISQLFGTLT 
SEFQERC* ITA*ANCLAH*P 
6181 - CAGGCTACAACCAAGACAACGTTCAAACCAAACACTTGGTGTTTACGTTGTCTTTGGAGT - 6240 
-QATTKTTFKPNTWCLRCLWS 
-RLQPRQRSNQTLGVYVVFGV 
GYNQDNVQTKHLVFTLSLEY 
6241 - ACAAAGCCAGTAGATACTTCAAATTCATTTGAAGTTCTGGCAGTAGAAGACACACAAGGA - 6300 
-TKPVDTSNSFEVLAVEDTQG
-QSQ*ILQIHLKFWQ*KTHKE
KASRYFKFI*SSGSRRHTRN 
6301 - ATGGACAATCTTGCTTGTGAAAGTCAACAACCCACCTCTGAAGAAGTAGTGGAAAATCCT - 6360 
-MDNLACESQQPTSEEVVENP 
-WTILLVKVNNPPLKK*WKIL 
GQSCL*KSTTHL*RSSGKSY
6361 - ACCATACAGAAGGAAGTCATAGAGTGTGACGTGAAAACTACCGAAGTTGTAGGCAATGTC - 6420 
-TIQKEVIECDVKTTEVVGNV 
-PYRRKS*SVT*KLPKL*AMS 
HTEGSHRV*RENYRSCRQCH 
6421 - ATACTTAAACCATCAGATGAAGGTGTTAAAGTAACACAAGAGTTAGGTCATGAGGATCTT - 6480
-ILKPSDEGVKVTQELGHEDL
-YLNHQMKVLK*HKS*VMRIL
T*TIR*RC*SNTRVRS*GSY
6481 - ATGGCTGCTTATGTGGAAAACACAAGCATTACCATTAAGAAACCTAATGAGCTTTCACTA - 6540
-MAAYVENTSITIKKPNELSL
-WLLMWKTQALPLRNLMSFH*
GCLCGKHKHYH*ET**AFTS 
6541 - GCCTTAGGTTTAAAAACAATTGCCACTCATGGTATTGCTGCAATTAATAGTGTTCCTTGG - 6600 
-ALGLKTIATHGIAAINSVPW 
-P*V*KQLPLMVLLQLIVFLG 
LRFKNNCHSWYCCN**CSLE
6601 - AGTAAAATTTTGGCTTATGTCAAACCATTCTTAGGACAAGCAGCAATTACAACATCAAAT - 6660
-SKILAYVKPFLGQAAITTSN
-VKFWLMSNHS*DKQQLQHQI
*NFGLCQTILRTSSNYNIKL
6661 - TGCGCTAAGAGATTAGCACAACGTGTGTTTAACAATTATATGCCTTATGTGTTTACATTA - 6720
-CAKRLAQRVFNNYMPYVFTL
-ALRD*HNVCLTIICLMCLHY
R*EISTTCV*QLYALCVYII
6721 - TTGTTCCAATTGTGTACTTTTACTAAAAGTACCAATTCTAGAATTAGAGCTTCACTACCT - 6780
-LFQLCTFTKSTNSRIRASLP
-CSNCVLLLKVPILELELHYL
VPIVYFY*KYQF*N*SFTTY
6781 - ACAACTATTGCTAAAAATAGTGTTAAGAGTGTTGCTAAATTATGTTTGGATGCCGGCATT - 6840
-TTIAKNSVKSVAKLCLDAGI
-QLLLKIVLRVLLNYVWMPAL
NYC*K*C*ECC*IMFGCRH*
6841 - AATTATGTGAAGTCACCCAAATTTTCTAAATTGTTCACAATCGCTATGTGGCTATTGTTG - 6900
-NYVKSPKFSKLFTIAMWLLL
-IM*SHPNFLNCSQSLCGYCC
LCEVTQIF*IVHNRYVAIVV
6901 - TTAAGTATTTGCTTAGGTTCTCTAATCTGTGTAACTGCTGCTTTTGGTGTACTCTTATCT - 6960
-LSICLGSLICVTAAFGVLLS
-*VFA*VL*SV*LLLLVYSYL
KYLLRFSNLCNCCFWCTLI*
6961 - AATTTTGGTGCTCCTTCTTATTGTAATGGCGTTAGAGAATTGTATCTTAATTCGTCTAAC - 7020 
-NFGAPSYCNGVRELYLNSSN 

- ILVLLLIVMALENCILIRLT 

FWCSFLL*WR*RIVS*FV*R 
7021 - GTTACTACTATGGATTTCTGTGAAGGTTCTTTTCCTTGCAGCATTTGTTTAAGTGGATTA - 7080
-VTTMDFCEGSFPCSICLSGL
-LLLWISVKVLFLAAFV*VD*
YYYGFL*RFFSLQHLFKWIR
7081 - GACTCCCTTGATTCTTATCCAGCTCTTGAAACCATTCAGGTGACGATTTCATCGTACAAG - 7140
-DSLDSYPALETIQVTISSYK 

- T PLI LIQLLKPFR*RFHRT S 

LP*FLSSS*NHSGDDFIVQA 
7141 - CTAGACTTGACAATTTTAGGTCTGGCCGCTGAGTGGGTTTTGGCATATATGTTGTTCACA - 7200 
-LDLTILGLAAEWVLAYMLFT 
-*T*QF*VWPLSGFWHICCSQ 
RLDNFRSGR*VGFGIYVVHK 
7201 - AAATTCTTTTATTTATTAGGTCTTTCAGCTATAATGCAGGTGTTCTTTGGCTATTTTGCT - 7260
-KFFYLLGLSAIMQVFFGYFA 
-NSFIY*VFQL*CRCSLAILL 
ILLFIRSFSYNAGVLWLFC* 
7261 - AGTCATTTCATCAGCAATTCTTGGCTCATGTGGTTTATCATTAGTATTGTACAAATGGCA - 7320
-SHFISNSWLMWFIISIVQMA

- VI SSAILGSCGLSLVLYKWH 

SFHQQFLAHVVYH*YCTNGT 
7321 - CCCGTTTCTGCAATGGTTAGGATGTACATCTTCTTTGCTTCTTTCTACTACATATGGAAG - 7380 
-PVSAMVRMYI FFASFYYIWK 
-PFLQWLGCTSSLLLSTTYGR 
RFCNG* DVHLLCFFLLHMEE 
7381 - AGCTATGTTCATATCATGGATGGTTGCACCTCTTCGACTTGCATGATGTGCTATAAGCGC - 7440 
-SYVHIMDGCTSSTCMMCYKR 
-AMFISWMVAPLRLA*CAISA
LCSYHGWLHLFDLHDVL*AQ 
7441 - AATCGTGCCACACGCGTTGAGTGTACAACTATTGTTAATGGCATGAAGAGATCTTTCTAT - 7500 
-NRATRVECTTIVNGMKRSFY 

-IVPHALSVQLLLMA*RDLSM

SCHTR*VYNYC*WHEEIFLC 
7501 - GTCTATGCAAATGGAGGCCGTGGCTTCTGCAAGACTCACAATTGGAATTGTCTCAATTGT - 7560 
-VYANGGRGFCKTHNWNCLNC
-SMQMEAVASARLTIGIVSIV
LCKWRPWLLQDSQLELSQL*
7561 - GACACATTTTGCACTGGTAGTACATTCATTAGTGATGAAGTTGCTCGTGATTTGTCACTC - 7620
-DTFCTGSTFISDEVARDLSL
-THFALVVHSLVMKLLVICHS
HILHW*YIH***SCS*FVTP
7621 - CAGTTTAAAAGACCAATCAACCCTACTGACCAGTCATCGTATATTGTTGATAGTGTTGCT - 7680 
-QFKRPINPTDQSSYIVDSVA 
-SLKDQSTLLTSHRILLIVLL 
V*KTNQPY*PVIVYC**CCC 
7681 - GTGAAAAATGGCGCGCTTCACCTCTACTTTGACAAGGCTGGTCAAAAGACCTATGAGAGA - 7740 

-VKNGALHLYFDKAGQKTYER
-*KMARFTSTLTRLVKRPMRD
EKWRASPLL*QGWSKDL*ET
7741 - CATCCGCTCTCCCATTTTGTCAATTTAGACAATTTGAGAGCTAACAACACTAAAGGTTCA - 7800
-HPLSHFVNLDNLRANNTKGS
-IRSPILSI*TI*ELTTLKVH
SALPFCQFRQFES*QH*RFT
7801 - CTGCCTATTAATGTCATAGTTTTTGATGGCAAGTCCAAATGCGACGAGTCTGCTTCTAAG - 7860
-LPINVIVFDGKSKCDESASK 
-CLLMS*FLMASPNATSLLLS 
AY*CHSF*WQVQMRRVCF*V 
7861 - TCTGCTTCTGTGTACTACAGTCAGCTGATGTGCCAACCTATTCTGTTGCTTGACCAAGCT - 7920 
-SASVYYSQLMCQPILLLDQA
-LLLCTTVS*CANLFCCLTKL
CFCVLQSADVPTYSVA*PSS
7921 - CTTGTATCAAACGTTGGAGATAGTACTGAAGTTTCCGTTAAGATGTTTGATGCTTATGTC - 7980
-LVSNVGDSTEVSVKMFDAYV
-LYQTLEIVLKFPLRCLMLMS
CIKRWR*Y*SFR*DV*CLCR 
7981 - GACACCTTTTCAGCAACTTTTAGTGTTCCTATGGAAAAACTTAAGGCACTTGTTGCTACA - 8040 
-DTFSATFSVPMEKLKALVAT 
-TPFQQLLVFLWKNLRHLLLQ 
HLFSNF*CSYGKT*GTCCYS
8041 - GCTCACAGCGAGTTAGCAAAGGGTGTAGCTTTAGATGGTGTCCTTTCTACATTCGTGTCA - 8100
-AHSELAKGVALDGVLSTFVS
-LTAS*QRV*L*MVSFLHSCQ
SQRVSKGCSFRWCPFYIRVS
8101 - GCTGCCCGACAAGGTGTTGTTGATACCGATGTTGACACAAAGGATGTTATTGAATGTCTC - 8160
-AARQGVVDTDVDTKDVIECL 

- LPDKVLLIPMLTQRMLLNVS 

CPTRCC*YRC*HKGCY*MSQ 
8161 - AAACTTTCACATCACTCTGACTTAGAAGTGACAGGTGACAGTTGTAACAATTTCATGCTC - 8220 
-KLSHHSDLEVTGDSCNNFML 
-NFHITLT*K*QVTVVTISCS
TFTSL*LRSDR*QL*QFHAH
8221 - ACCTATAATAAGGTTGAAAACATGACGCCCAGAGATCTTGGCGCATGTATTGACTGTAAT - 8280
-TYNKVENMTPRDLGACIDCN
-PIIRLKT*RPEILAHVLTVM
L**G*KHDAQRSWRMY*L*C
8281 - GCAAGGCATATCAATGCCCAAGTAGCAAAAAGTCACAATGTTTCACTCATCTGGAATGTA - 8340
-ARHINAQVAKSHNVSLIWNV 
-QGISMPK*QKVTMFHSSGM* 
KAYQCPSSKKSQCFTHLECK 
8341 - AAAGACTACATGTCTTTATCTGAACAGCTGCGTAAACAAATTCGTACTGCTGCCAAGAAG - 8400 
-KDYMSLSEQLRKQIRTAAKK 
-KTTCLYLNSCVNKFVLLPRR
RLHVFI*TAA*TNSYCCQEE
8401 - AACAACATACCTTTTACACTAACTTGTGCTACAACTAGACAGGTTGTCAATGTCATAACT - 8460
-NNIPFTLTCATTRQVVNVIT
-TTYLLH*LVLQLDRLSMS*L
QHTFYTNLCYN*TGCQCHNY
8461 - ACTAAAATCTCACTCAAGGGTGGTAAGATTGTTAGTACTTGTTTTAAACTTATGCTTAAG - 8520
-TKISLKGGKIVSTCFKLMLK
-LKSHSRVVRLLVLVLNLCLR
*NLTQGW*DC*YLF*TYA*G
8521 - GCCACATTATTGTGCGTTCTTGCTGCATTGGTTTGTTATATCGTTATGCCAGTACATACA - 8580
-ATLLCVLAALVCYIVMPVHT 
-PHYCAFLLHWFVISLCQYIH 
HIIVRSCCIGLLYRYASTYI 
8581 - TTGTCAATCCATGATGGTTACACAAATGAAATCATTGGTTACAAAGCCATTCAGGATGGT - 8640 
-LSIHDGYTNEIIGYKAIQDG 
-CQSMMVTQMKSLVTKPFRMV 
VNP*WLHK*NHWLQSHSGWC 
8641 - GTCACTCGTGACATCATTTCTACTGATGATTGTTTTGCAAATAAACATGCTGGTTTTGAC - 8700
-VTRDIISTDDCFANKHAGFD 
-SLVTSFLLMIVLQINMLVLT 
HS*HHFY**LFCK*TCWF*R 
8701 - GCATGGTTTAGCCAGCGTGGTGGTTCATACAAAAATGACAAAAGCTGCCCTGTAGTAGCT - 8760 
-AWFSQRGGSYKNDKSCPVVA 
-HGLASVVVHTKMTKAAL**L
MV*PAWWFIQK*QKLPCSSC
8761 - GCTATCATTACAAGAGAGATTGGTTTCATAGTGCCTGGCTTACCGGGTACTGTGCTGAGA - 8820
-AIITREIGFIVPGLPGTVLR
-LSLQERLVS*CLAYRVLC*E
YHYKRDWFHSAWLTGYCAES
8821 - GCAATCAATGGTGACTTCTTGCATTTTCTACCTCGTGTTTTTAGTGCTGTTGGCAACATT - 8880
-AINGDFLHFLPRVFSAVGNI
-QSMVTSCIFYLVFLVLLATF
-NQW*LLAFSTSCF*CCWQHL
8881 - TGCTACACACCTTCCAAACTCATTGAGTATAGTGATTTTGCTACCTCTGCTTGCGTTCTT - 8940
-CYTPSKLIEYSDFATSACVL
-ATHLPNSLSIVILLPLLAFL
LHTFQTH*V**FCYLCLRSC

8941 - GCTGCTGAGTGTACAATTTTTAAGGATGCTATGGGCAAACCTGTGCCATATTGTTATGAC - 9000
-AAECTIFKDAMGKPVPYCYD
-LLSVQFLRMLWANLCHIVMT
C*VYNF*GCYGQTCAILL*H
9001 - ACTAATTTGCTAGAGGGTTCTATTTCTTATAGTGAGCTTCGTCCAGACACTCGTTATGTG - 9060
-TNLLEGSISYSELRPDTRYV
-LIC*RVLFLIVSFVQTLVMC
*FARGFYFL**ASSRHSLCA
9061 - CTTATGGATGGTTCCATCATACAGTTTCCTAACACTTACCTGGAGGGTTCTGTTAGAGTA - 9120
-LMDGSIIQFPNTYLEGSVRV
-LWMVPSYSFLTLTWRVLLE*
-YGWFHHTVS*HLPGGFC*SS

9121 - GTAACAACTTTTGATGCTGAGTACTGTAGACATGGTACATGCGAAAGGTCAGAAGTAGGT - 9180
-VTTFDAEYCRHGTCERSEVG
-*QLLMLSTVDMVHAKGQK*V
-NNF*C*VL*TWYMRKVRSRY
9181 - ATTTGCCTATCTACCAGTGGTAGATGGGTTCTTAATAATGAGCATTACAGAGCTCTATCA - 9240
-ICLSTSGRWVLNNEHYRALS
-FAYLPVVDGFLIMSITELYQ
LPIYQW*MGS***ALQSSIR
9241 - GGAGTTTTCTGTGGTGTTGATGCGATGAATCTCATAGCTAACATCTTTACTCCTCTTGTG - 9300
-GVFCGVDAMNLIANIFTPLV
-EFSVVLMR*IS*LTSLLLLC
SFLWC*CDESHS*HLYSSCA
9301 - CAACCTGTGGGTGCTTTAGATGTGTCTGCTTCAGTAGTGGCTGGTGGTATTATTGCCATA - 9360
-QPVGALDVSASVVAGGIIAI
-NLWVL*MCLLQ*WLVVLLPY
TCGCFRCVCFSSGWWYYCHI
9361 - TTGGTGACTTGTGCTGCCTACTACTTTATGAAATTCAGACGTGTTTTTGGTGAGTACAAC - 9420
-LVTCAAYYFMKFRRVFGEYN
-W*LVLPTTL*NSDVFLVSTT
GDLCCLLLYEIQTCFW*VQP
9421 - CATGTTGTTGCTGCTAATGCACTTTTGTTTTTGATGTCTTTCACTATACTCTGTCTGGTA - 9480
-HVVAANALLFLMSFTILCLV
-MLLLLMHFCF*CLSLYSVWY
CCCC*CTFVFDVFHYTLSGT
9481 - CCAGCTTACAGCTTTCTGCCGGGAGTCTACTCAGTCTTTTACTTGTACTTGACATTCTAT - 9540
-PAYSFLPGVYSVFYLYLTFY 
-QLTAFCRESTQSFTCT*HSI 
SLQLSAGSLLSILLVLDILF 
9541 - TTCACCAATGATGTTTCATTCTTGGCTCACCTTCAATGGTTTGCCATGTTTTCTCCTATT - 9600 
-FTNDVSFLAHLQWFAMFSPI 
-SPMMFHSWLTFNGLPCFLLL 
HQ*CFILGSPSMVCHVFSYC 
9601 - GTGCCTTTTTGGATAACAGCAATCTATGTATTCTGTATTTCTCTGAAGCACTGCCATTGG - 9660 
-VPFWITAIYVFCISLKHCHW 
-CLFG*QQSMYSVFL*STAIG 

- AFLDNSNLCILYFSEALPLV 

9661 - TTCTTTAACAACTATCTTAGGAAAAGAGTCATGTTTAATGGAGTTACATTTAGTACCTTC - 9720 
-FFNNYLRKRVMFNGVT FSTF 
-SLTTILGKESCLMELHLVPS 
L*QLS*EKSHV*WSYI*YLR 

9721 - GAGGAGGCTGCTTTGTGTACCTTTTTGCTCAACAAGGAAATGTACCTAAAATTGCGTAGC - 9780 
-EEAALCTFLLNKEMYLKLRS 
-RRLLCVPFCSTRKCT*NCVA 
GGCFVYLFAQQGNVPKIA*R 

9781 - GAGACACTGTTGCCACTTACACAGTATAACAGGTATCTTGCTCTATATAACAAGTACAAG - 9840
-ETLLPLTQYNRYLALYNKYK 

- RHCCHLHSITGILLYITSTS 

DTVATYTV*QVSCSI *QVQV 
9841 - TATTTCAGTGGAGCCTTAGATACTACCAGCTATCGTGAAGCAGCTTGCTGCCACTTAGCA - 9900 
-YFSGALDTTSYREAACCHLA
-ISVEP*ILPAIVKQLAAT*Q
FQWSLRYYQLS*SSLLPLSK
9901 - AAGGCTCTAAATGACTTTAGCAACTCAGGTGCTGATGTTCTCTACCAACCACCACAGACA - 9960
-KALNDFSNSGADVLYQPPQT
-RL*MTLATQVLMFSTNHHRH
GSK*L*QLRC*CSLPTTTDI
9961 - TCAATCACTTCTGCTGTTCTGCAGAGTGGTTTTAGGAAAATGGCATTCCCGTCAGGCAAA - 10020 
-SITSAVLQSGFRKMAFPSGK
-QSLLLFCRVVLGKWHSRQAK
NHFCCSAEWF*ENGIPVRQS
10021 - GTTGAAGGGTGCATGGTACAAGTAACCTGTGGAACTACAACTCTTAATGGATTGTGGTTG - 10080 
-VEGCMVQVTCGTTTLNGLWL
-LKGAWYK*PVELQLLMDCGW
*RVHGTSNLWNYNS*WIVVG
10081 - GATGACACAGTATACTGTCCAAGACATGTCATTTGCACAGCAGAAGACATGCTTAATCCT - 10140
-DDTVYCPRHVICTAEDMLNP
-MTQYTVQDMSFAQQKTCLIL
*HSILSKTCHLHSRRHA*S*
10141 - AACTATGAAGATCTGCTCATTCGCAAATCCAACCATAGCTTTCTTGTTCAGGCTGGCAAT - 10200
-NYEDLLIRKSNHSFLVQAGN
-TMKICSFANPTIAFLFRLAM
L*RSAHSQIQP*LSCSGWQC
10201 - GTTCAACTTCGTGTTATTGGCCATTCTATGCAAAATTGTCTGCTTAGGCTTAAAGTTGAT - 10260 
-VQLRVIGHSMQNCLLRLKVD
-FNFVLLAILCKIVCLGLKLI
STSCYWPFYAKLSA*A*S*Y 
10261 - ACTTCTAACCCTAAGACACCCAAGTATAAATTTGTCCGTATCCAACCTGGTCAAACATTT - 10320 
-TSNPKTPKYKFVRIQPGQTF 
-LLTLRHPSINLSVSNLVKHF 
F*P*DTQV*ICPYPTWSNIF
10321 - TCAGTTCTAGCATGCTACAATGGTTCACCATCTGGTGTTTATCAGTGTGCCATGAGACCT - 10380 
-SVLACYNGSPSGVYQCAMRP 
-QF*HATMVHHLVFISVP*DL
SSSMLQWFTIWCLSVCHET* 
10381 - AATCATACCATTAAAGGTTCTTTCCTTAATGGATCATGTGGTAGTGTTGGTTTTAACATT - 10440 
-NHTIKGSFLNGSCGSVGFNI
-IIPLKVLSLMDHVVVLVLTL
SYH*RFFP*WIMW*CWF*H*
10441 - GATTATGATTGCGTGTCTTTCTGCTATATGCATCATATGGAGCTTCCAACAGGAGTACAC - 10500 
-DYDCVSFCYMHHMELPTGVH
-IMIACLSAICIIWSFQQEYT
L*LRVFLLYASYGASNRSTR
10501 - GCTGGTACTGACTTAGAAGGTAAATTCTATGGTCCATTTGTTGACAGACAAACTGCACAG - 10560
-AGTDLEGKFYGPFVDRQTAQ
-LVLT*KVNSMVHLLTDKLHR
WY*LRR*ILWSIC*QTNCTG
10561 - GCTGCAGGTACAGACACAACCATAACATTAAATGTTTTGGCATGGCTGTATGCTGCTGTT - 10620 
-AAGTDTTITLNVLAWLYAAV
-LQVQTQP*H*MFWHGCMLLL
CRYRHNHNIKCFGMAVCCCY 
10621 - ATCAATGGTGATAGGTGGTTTCTTAATAGATTCACCACTACTTTGAATGACTTTAACCTT - 10680
-INGDRWFLNRFTTTLNDFNL
-SMVIGGFLIDSPLL*MTLTL
QW**VVS**IHHYFE*L*PC
10681 - GTGGCAATGAAGTACAACTATGAACCTTTGACACAAGATCATGTTGACATATTGGGACCT - 10740
-VAMKYNYEPLTQDHVDILGP
-WQ*STTMNL*HKIMLTYWDL 
GNEVQL*TFDTRSC*HIGTS 
10741 - CTTTCTGCTCAAACAGGAATTGCCGTCTTAGATATGTGTGCTGCTTTGAAAGAGCTGCTG - 10800 
-LSAQTGIAVLDMCAALKELL 
-FLLKQELPS*ICVLL*KSCC 
FCSNRNCRLRYVCCFERAAA 
10801 - CAGAATGGTATGAATGGTCGTACTATCCTTGGTAGCACTATTTTAGAAGATGAGTTTACA - 10860 
-QNGMNGRTILGSTILEDEFT 
-RMV*MVVLSLVALF*KMSLH 
EWYEWSYYPW*HYFRR*VYT 
10861 - CCATTTGATGTTGTTAGACAATGCTCTGGTGTTACCTTCCAAGGTAAGTTCAAGAAAATT - 10920 
-PFDVVRQCSGVTFQGKFKKI 
-HLMLLDNALVLPSKVSSRKL
I*CC*TMLWCYLPR*VQENC
10921 - GTTAAGGGCACTCATCATTGGATGCTTTTAACTTTCTTGACATCACTATTGATTCTTGTT - 10980
-VKGTHHWMLLTFLTSLLILV
-LRALIIGCF*LS*HHY*FLF
*GHSSLDAFNFLDITIDSCS

10981 - CAAAGTACACAGTGGTCACTGTTTTTCTTTGTTTACGAGAATGCTTTCTTGCCATTTACT - 11040 
-QSTQWSLFFFVYENAFLPFT
-KVHSGHCFSLFTRMLSCHLL
KYTVVTVFLCLRECFLAIYS
11041 - CTTGGTATTATGGCAATTGCTGCATGTGCTATGCTGCTTGTTAAGCATAAGCACGCATTC - 11100 
-LGIMAIAACAMLLVKHKHAF 
-LVLWQLLHVLCCLLSISTHS 
WYYGNCCMCYAAC*A*ARIL 
11101 - TTGTGCTTGTTTCTGTTACCTTCTCTTGCAACAGTTGCTTACTTTAATATGGTCTACATG - 11160
-LCLFLLPSLATVAYFNMVYM
-CACFCYLLLQQLLTLIWSTC
VLVSVTFSCNSCLL*YGLHA
11161 - CCTGCTAGCTGGGTGATGCGTATCATGACATGGCTTGAATTGGCTGACACTAGCTTGTCT - 11220 
-PASWVMRIMTWLELADTSLS 
-LLAG*CVS*HGLNWLTLACL 
C*LGDAYHDMA*IG*H*LVW
11221 - GGTTATAGGCTTAAGGATTGTGTTATGTATGCTTCAGCTTTAGTTTTGCTTATTCTCATG - 11280 
-GYRLKDCVMYASALVLLILM 
-VIGLRIVLCMLQL*FCLFS* 
L*A*GLCYVCFSFSFAYSHD 
11281 - ACAGCTCGCACTGTTTATGATGATGCTGCTAGACGTGTTTGGACACTGATGAATGTCATT - 11340
-TARTVYDDAARRVWTLMNVI
-QLALFMMMLLDVFGH**MSL
SSHCL**CC*TCLDTDECHY
11341 - ACACTTGTTTACAAAGTCTACTATGGTAATGCTTTAGATCAAGCTATTTCCATGTGGGCC - 11400 
-TLVYKVYYGNALDQAISMWA 
-HLFTKSTMVML*IKLFPCGP
TCLQSLLW*CFRSSYFHVGL 
11401 - TTAGTTATTTCTGTAACCTCTAACTATTCTGGTGTCGTTACGACTATCATGTTTTTAGCT - 11460
-LVISVTSNYSGVVTTIMFLA 
-*LFL*PLTILVSLRLSCF*L 
SYFCNL*LFWCRYDYHVFS*
11461 - AGAGCTATAGTGTTTGTGTGTGTTGAGTATTACCCATTGTTATTTATTACTGGCAACACC - 11520 
-RAIVFVCVEYYPLLFITGNT 
-EL*CLCVLSITHCYLLLATP
SYSVCVC*VLPIVIYYWQHL
11521 - TTACAGTGTATCATGCTTGTTTATTGTTTCTTAGGCTATTGTTGCTGCTGCTACTTTGGC - 11580
-LQCIMLVYCFLGYCCCCYFG
-YSVSCLFIVS*AIVAAATLA
TVYHACLLFLRLLLLLLLWP
11581 - CTTTTCTGTTTACTCAACCGTTACTTCAGGCTTACTCTTGGTGTTTATGACTACTTGGTC - 11640
-LFCLLNRYFRLTLGVYDYLV
-FSVYSTVTSGLLLVFMTTWS
FLFTQPLLQAYSWCL*LLGL 
11641 - TCTACACAAGAATTTAGGTATATGAACTCCCAGGGGCTTTTGCCTCCTAAGAGTAGTATT - 11700 
-STQEFRYMNSQGLLPPKSSI 
-LHKNLGI*TPRGFCLLRVVL 
YTRI*VYELPGAFAS*E*Y*
11701 - GATGCTTTCAAGCTTAACATTAAGTTGTTGGGTATTGGAGGTAAACCATGTATCAAGGTT - 11760
-DAFKLNIKLLGIGGKPCIKV
-MLSSLTLSCWVLEVNHVSRL
CFQA*H*VVGYWR*TMYQGC 
11761 - GCTACTGTACAGTCTAAAATGTCTGACGTAAAGTGCACATCTGTGGTACTGCTCTCGGTT - 11820
-ATVQSKMSDVKCTSVVLLSV
-LLYSLKCLT*SAHLWYCSRF
YCTV*NV*RKVHICGTALGS 
11821 - CTTCAACAACTTAGAGTAGAGTCATCTTCTAAATTGTGGGCACAATGTGTACAACTCCAC - 11880 
-LQQLRVESSSKLWAQCVQLH 
-FNNLE*SHLLNCGHNVYNST 
STT*SRVIF*IVGTMCTTPQ 
11881 - AATGATATTCTTCTTGCAAAAGACACAACTGAAGCTTTCGAGAAGATGGTTTCTCTTTTG - 11940 
-NDILLAKDTTEAFEKMVSLL 
-MIFFLQKTQLKLSRRWFLFC
*YSSCKRHN*SFREDGFSFV
11941 - TCTGTTTTGCTATCCATGCAGGGTGCTGTAGACATTAATAGGTTGTGCGAGGAAATGCTC - 12000
-SVLLSMQGAVDINRLCEEML
-LFCYPCRVL*TLIGCARKCS 
CFAIHAGCCRH* *VVRGNAR 
12001 - GATAACCGTGCTACTCTTCAGGCTATTGCTTCAGAATTTAGTTCTTTACCATCATATGCC - 12060
-DNRATLQAIASEFSSLPSYA 
-ITVLLFRLLLQNLVLYHHMP 
*PCYSSGYCFRI*FFTIICR 
12061 - GCTTATGCCACTGCCCAGGAGGCCTATGAGCAGGCTGTAGCTAATGGTGATTCTGAAGTC - 12120 
-AYATAQEAYEQAVANGDSEV 
-LMPLPRRPMSRL*LMVILKS 
LCHCPGGL*AGCS*W*F*SR 
12121 - GTTCTCAAAAAGTTAAAGAAATCTTTGAATGTGGCTAAATCTGAGTTTGACCGTGATGCT - 12180
-VLKKLKKSLNVAKSEFDRDA 
-FSKS*RNL*MWLNLSLTVML 
SQKVKEIFECG*I*V*P*CC 
12181 - GCCATGCAACGCAAGTTGGAAAAGATGGCAGATCAGGCTATGACCCAAATGTACAAACAG - 12240
-AMQRKLEKMADQAMTQMYKQ
-PCNASWKRWQIRL*PKCTNR
HATQVGKDGRSGYDPNVQTG 
12241 - GCAAGATCTGAGGACAAGAGGGCAAAAGTAACTAGTGCTATGCAAACAATGCTCTTCACT - 12300 
-ARSEDKRAKVTSAMQTMLFT
-QDLRTRGQK*LVLCKQCSSL
KI*GQEGKSN*CYANNALHY
12301 - ATGCTTAGGAAGCTTGATAATGATGCACTTAACAACATTATCAACAATGCGCGTGATGGT - 12360
-MLRKLDNDALNNIINNARDG
-CLGSLIMMHLTTLSTMRVMV
A*EA***CT*QHYQQCA*WL 
12361 - TGTGTTCCACTCAACATCATACCATTGACTACAGCAGCCAAACTCATGGTTGTTGTCCCT - 12420 
-CVPLNIIPLTTAAKLMVVVP 
-VFHSTSYH*LQQPNSWLLSL 
CSTQHHTIDYSSQIHGCCP* 
12421 - GATTATGGTACCTACAAGAACACTTGTGATGGTAACACCTTTACATATGCATCTGCACTC - 12480 
-DYGTYKNTCDGNTFTYASAL
-IMVPTRTLVMVTPLHMHLHS
LWYLQEHL*W*HLYICICTL
12481 - TGGGAAATCCAGCAAGTTGTTGATGCGGATAGCAAGATTGTTCAACTTAGTGAAATTAAC - 12540 
-WEIQQVVDADSKIVQLSEIN 
-GKSSKLLMRIARLFNLVKLT
GNPASC*CG*QDCST**N*H
12541 - ATGGACAATTCACCAAATTTGGCTTGGCCTCTTATTGTTACAGCTCTAAGAGCCAACTCA - 12600
-MDNSPNLAWPLIVTALRANS
-WTIHQIWLGLLLLQL*EPTQ
GQFTKFGLASYCYSSKSQLS
12601 - GCTGTTAAACTACAGAATAATGAACTGAGTCCAGTAGCACTACGACAGATGTCCTGTGCG - 12660
-AVKLQNNELSPVALRQMSCA
-LLNYRIMN*VQ*HYDRCPVR
C*TTE**TESSSTTTDVLCG
12661 - GCTGGTACCACACAAACAGCTTGTACTGATGACAATGCACTTGCCTACTATAACAATTCG - 12720
-AGTTQTACTDDNALAYYNNS 
-LVPHKQLVLMTMHLPTITIR 
WYHTNSLY**QCTCLL*QFE 
12721 - AAGGGAGGTAGGTTTGTGCTGGCATTACTATCAGACCACCAAGATCTCAAATGGGCTAGA - 12780 
-KGGRFVLALLSDHQDLKWAR 
-REVGLCWHYYQTTKISNGLD
GR*VCAGITIRPPRSQMG*I
12781 - TTCCCTAAGAGTGATGGTACAGGTACAATTTACACAGAACTGGAACCACCTTGTAGGTTT - 12840 
-FPKSDGTGTIYTELEPPCRF
-SLRVMVQVQFTQNWNHLVGL 
P*E*WYRYNLHRTGTTL*VC 
12841 - GTTACAGACACACCAAAAGGGCCTAAAGTGAAATACTTGTACTTCATCAAAGGCTTAAAC - 12900 
-VTDTPKGPKVKYLYFIKGLN
-LQTHQKGLK*NTCTSSKA*T 
YRHTKRA*SEILVLHQRLKQ 
12901 - AACCTAAATAGAGGTATGGTGCTGGGCAGTTTAGCTGCTACAGTACGTCTTCAGGCTGGA - 12960
-NLNRGMVLGSLAATVRLQAG
-T*IEVWCWAV*LLQYVFRLE
PK*RYGAGQFSCYSTSSGWK 
12961 - AATGCTACAGAAGTACCTGCCAATTCAACTGTGCTTTCCTTCTGTGCTTTTGCAGTAGAC - 13020 
-NATEVPANSTVLSFCAFAVD 
-MLQKYLPIQLCFPSVLLQ*T 
CYRSTCQFNCAFLLCFCSRP 
13021 - CCTGCTAAAGCATATAAGGATTACCTAGCAAGTGGAGGACAACCAATCACCAACTGTGTG - 13080 
-PAKAYKDYLASGGQPITNCV 
-LLKHIRIT*QVEDNQSPTV* 
C*SI*GLPSKWRTTNHQLCE
13081 - AAGATGTTGTGTACACACACTGGTACAGGACAGGCAATTACTGTAACACCAGAAGCTAAC - 13140
-KMLCTHTGTGQAITVTPEAN
-RCCVHTLVQDRQLL*HQKLT
DVVYTHWYRTGNYCNTRS*H 
13141 - ATGGACCAAGAGTCCTTTGGTGGTGCTTCATGTTGTCTGTATTGTAGATGCCACATTGAC - 13200 
-MDQESFGGASCCLYCRCHID 
-WTKSPLVVLHVVCIVDATLT 
GPRVLWWCFMLSVL*MPH*P
13201 - CATCCAAATCCTAAAGGATTCTGTGACTTGAAAGGTAAGTACGTCCAAATACCTACCACT - 13260
-HPNPKGFCDLKGKYVQIPTT
-IQILKDSVT*KVSTSKYLPL
SKS*RIL*LER*VRPNTYHL 
13261 - TGTGCTAATGACCCAGTGGGTTTTACACTTAGAAACACAGTCTGTACCGTCTGCGGAATG - 13320 
-CANDPVGFTLRNTVCTVCGM 
-VLMTQWVLHLETQSVPSAEC
C**PSGFYT*KHSLYRLRNV
13321 - TGGAAAGGTTATGGCTGTAGTTGTGACCAACTCCGCGAACCCTTGATGCAGTCTGCGGAT - 13380 
-WKGYGCSCDQLREPLMQSAD 
-GKVMAVVVTNSANP*CSLRM
ERLWL*L*PTPRTLDAVCGC
13381 - GCATCAACGTTTTTAAACGGGTTTGCGGTGTAAGTGCAGCCCGTCTTACACCGTGCGGCA - 13440 
-ASTFLNGFAV*VQPVLHRAA 
-HQRF*TGLRCKCSPSYTVRH
13441 - CAGGCACTAGTACTGATGTCGTCTACAGGGCTTTTGATATTTACAACGAAAAAAGTGCTG - 13500
-QALVLMSSTGLLIFTTKKVL
-RH*Y*CRLQGF*YLQRKKCW
GTSTDVVYRAFDIYNEKSAG 
13501 - GTTTTGCAAAGTTCCTAAAAACTAATTGCTGTCGCTTCCAGGAGAAGGATGAGGAAGGCA - 13560 
-VLQSS*KLIAVASRRRMRKA
-FCKVPKN*LLSLPGEG*GRQ
FAKFLKTNCCRFQEKDEEGN
13561 - ATTTATTAGACTCTTACTTTGTAGTTAAGAGGCATACTATGTCTAACTACCAACATGAAG - 13620
-IY*TLTL*LRGILCLTTNMK
-FIRLLLCS*EAYYV*LPT*R
LLDSYFVVKRHTMSNYQHEE 
13621 - AGACTATTTATAACTTGGTTAAAGATTGTCCAGCGGTTGCTGTCCATGACTTTTTCAAGT - 13680 
-RLFITWLKIVQRLLSMTFSS
-DYL*LG*RLSSGCCP*LFQV
TIYNLVKDCPAVAVHDFFKF
13681 - TTAGAGTAGATGGTGACATGGTACCACATATATCACGTCAGCGTCTAACTAAATACACAA - 13740
-LE*MVTWYHIYHVSV*LNTQ
-*SRW*HGTTYITSASN*IHN 
RVDGDMVPHISRQRLTKYTM 
13741 - TGGCTGATTTAGTCTATGCTCTACGTCATTTTGATGAGGGTAATTGTGATACATTAAAAG - 13800 
-WLI*SMLYVILMRVIVIH*K 
-G*FSLCSTSF**G*L*YIKR
ADLVYALRHFDEGNCDTLKE 
13801 - AAATACTCGTCACATACAATTGCTGTGATGATGATTATTTCAATAAGAAGGATTGGTATG - 13860 
-KYSSHTIAVMMIISIRRIGM 
-NTRHIQLL***LFQ*EGLV*
ILVTYNCCDDDYFNKKDWYD 
13861 - ACTTCGTAGAGAATCCTGACATCTTACGCGTATATGCTAACTTAGGTGAGCGTGTACGCC - 13920 
-TS*RILTSYAYMLT*VSVYA 
-LRRES*HLTRIC*LR*ACTP 
FVENPDILRVYANLGERVRQ 
13921 - AATCATTATTAAAGACTGTACAATTCTGCGATGCTATGCGTGATGCAGGCATTGTAGGCG - 13980 
-NHY*RLYNSAMLCVMQAL*A
-IIIKDCTILRCYA*CRHCRR
SLLKTVQFCDAMRDAGIVGV
13981 - TACTGACATTAGATAATCAGGATCTTAATGGGAACTGGTACGATTTCGGTGATTTCGTAC - 14040 
-Y*H*IIRILMGTGTISVISY 
-TDIR*SGS*HELVRFR*FRT 
LTLDNQDLNGNWYDFGDFVQ 
14041 - AAGTAGCACCAGGCTGCGGAGTTCCTATTGTGGATTCATATTACTCATTGCTGATGCCCA - 14100 
-K*HQAAEFLLWIHITHC*CP 
-SSTRLRSSYCGFILLIADAH 
VAPGCGVPIVDSYYSLLMPI 
14101 - TCCTCACTTTGACTAGGGCATTGGCTGCTGAGTCCCATATGGATGCTGATCTCGCAAAAC - 14160 
-SSL*LGHWLLSPIWMLISQN
-PHFD*GIGC*VPYGC*SRKT
LTLTRALAAESHMDADLAKP
14161 - CACTTATTAAGTGGGATTTGCTGAAATATGATTTTACGGAAGAGAGACTTTGTCTCTTCG - 14220 
-HLLSGIC*NMILRKRDFVSS 
-TY*VGFAEI*FYGRETLSLR 
LIKWDLLKYDFTEERLCLFD 
14221 - ACCGTTATTTTAAATATTGGGACCAGACATACCATCCCAATTGTATTAACTGTTTGGATG - 14280
-TVILNIGTRHTIPIVLTVWM
-PLF*ILGPDIPSQLY*LFG*
RYFKYWDQTYHPNCINCLDD 
14281 - ATAGGTGTATCCTTCATTGTGCAAACTTTAATGTGTTATTTTCTACTGTGTTTCCACCTA - 14340
-IGVSFIVQTLMCYFLLCFHL
-*VYPSLCKL*CVIFYCVSTY
RCILHCANFNVLFSTVFPPT 
14341 - CAAGTTTTGGACCACTAGTAAGAAAAATATTTGTAGATGGTGTTCCTTTTGTTGTTTCAA - 14400 
-QVLDH**EKYL*MVFLLLFQ
-KFWTTSKKNICRWCSFCCFN 
SFGPLVRKIFVDGVPFVVST 
14401 - CTGGATACCATTTTCGTGAGTTAGGAGTCGTACATAATCAGGATGTAAACTTACATAGCT - 14460
-LDTIFVS*ESYIIRM*TYIA
-WIPFS*VRSRT*SGCKLT*L 
GYHFRELGVVHNQDVNLHSS 
14461 - CGCGTCTCAGTTTCAAGGAACTTTTAGTGTATGCTGCTGATCCAGCTATGCATGCAGCTT - 14520 
-RVSVSRNF*CMLLIQLCMQL
-ASQFQGTFSVCC*SSYACSF 
RLSFKELLVYAADPAMHAAS 
14521 - CTGGCAATTTATTGCTAGATAAACGCACTACATGCTTTTCAGTAGCTGCACTAACAAACA - 14580 
-LAIYC*INALHAFQ*LH*QT
-WQFIAR*THYMLFSSCTNKQ 
GNLLLDKRTTCFSVAALTNN 
14581 - ATGTTGCTTTTCAAACTGTCAAACCCGGTAATTTTAATAAAGACTTTTATGACTTTGCTG - 14640 
-MLLFKLSNPVILIKTFMTLL 
-CCFSNCQTR*F**RLL*LCC 
VAFQTVKPGNFNKDFYDFAV 
14641 - TGTCTAAAGGTTTCTTTAAGGAAGGAAGTTCTGTTGAACTAAAACACTTCTTCTTTGCTC - 14700 
-CLKVSLRKEVLLN*NTSSLL 
-V*RFL*GRKFC*TKTLLLCS 
SKGFFKEGSSVELKHFFFAQ 
14701 - AGGATGGCAACGCTGCTATCAGTGATTATGACTATTATCGTTATAATCTGCCAACAATGT - 14760 
-RMATLLSVIMTIIVIICQQC 
-GWQRCYQ*L*LLSL*SANNV 
DGNAAISDYDYYRYNLPTMC 
14761 - GTGATATCAGACAACTCCTATTCGTAGTTGAAGTTGTTGATAAATACTTTGATTGTTACG - 14820
-VISDNSYS*LKLLINTLIVT
-*YQTTPIRS*SC**IL*LLR
DIRQLLFVVEVVDKYFDCYD 
14821 - ATGGTGGCTGTATTAATGCCAACCAAGTAATCGTTAACAATCTGGATAAATCAGCTGGTT - 14880
-MVAVLMPTK*SLTIWINQLV
-WWLY*CQPSNR*QSG*ISWF 
GGCINANQVIVNNLDKSAGF 
14881 - TCCCATTTAATAAATGGGGTAAGGCTAGACTTTATTATGACTCAATGAGTTATGAGGATC - 14940 
-SHLINGVRLDFIMTQ*VMRI
-PI**MG*G*TLL*LNEL*GS
PFNKWGKARLYYDSMSYEDQ
14941 - AAGATGCACTTTTCGCGTATACTAAGCGTAATGTCATCCCTACTATAACTCAAATGAATC - 15000
-KMHFSRILSVMSSLL*LK*I
-RCTFRVY*A*CHPYYNSNES
DALFAYTKRNVIPTITQMNL 
15001 - TTAAGTATGCCATTAGTGCAAAGAATAGAGCTCGCACCGTAGCTGGTGTCTCTATCTGTA - 15060
-LSMPLVQRIELAP*LVSLSV
-*VCH*CKE*SSHRSWCLYL*
KYAISAKNRARTVAGVSICS
15061 - GTACTATGACAAATAGACAGTTTCATCAGAAATTATTGAAGTCAATAGCCGCCACTAGAG - 15120 
-VL*QIDSFIRNY*SQ*PPLE
-YYDK*TVSSEIIEVNSRH*R
TMTNRQFHQKLLKSIAATRG
15121 - GAGCTACTGTGGTAATTGGAACAAGCAAGTTTTACGGTGGCTGGCATAATATGTTAAAAA - 15180
-ELLW*LEQASFTVAGIIC*K
-SYCGNWNKQVLRWLA*YVKN
ATVVIGTSKFYGGWHNMLKT
15181 - CTGTTTACAGTGATGTAGAAACTCCACACCTTATGGGTTGGGATTATCCAAAATGTGACA - 15240
-LFTVM*KLHTLWVGIIQNVT
-CLQ*CRNSTPYGLGLSKM*Q
VYSDVETPHLMGWDYPKCDR
15241 - GAGCCATGCCTAACATGCTTAGGATAATGGCCTCTCTTGTTCTTGCTCGCAAACATAACA - 15300
-EPCLTCLG*WPLLFLLANIT
-SHA*HA*DNGLSCSCSQT*H
AMPNMLRIMASLVLARKHNT 
15301 - CTTGCTGTAACTTATCACACCGTTTCTACAGGTTAGCTAACGAGTGTGCGCAAGTATTAA - 15360
-LAVTYHTVSTG*LTSVRKY* 
-LL*LITPFLQVS*RVCASIK 
CCNLSHRFYRLANECAQVLS 
15361 - GTGAGATGGTCATGTGTGGCGGCTCACTATATGTTAAACCAGGTGGAACATCATCCGGTG - 15420 
-VRWSCVAAHYMLNQVEHHPV 
-*DGHVWRLTIC*TRWNIIR* 
EMVMCGGSLYVKPGGTSSGD 
15421 - ATGCTACAACTGCTTATGCTAATAGTGTCTTTAACATTTGTCAAGCTGTTACAGCCAATG - 15480 
-MLQLLMLIVSLTFVKLLQPM
-CYNCLC**CL*HLSSCYSQC 
ATTAYANSVFNICQAVTANV
15481 - TAAATGCACTTCTTTCAACTGATGGTAATAAGATAGCTGACAAGTATGTCCGCAATCTAC - 15540 
-*MHFFQLMVIR*LTSMSAIY 
-KCTSFN*W**DS*QVCPQST 
NALLSTDGNKIADKYVRNLQ 
15541 - AACACAGGCTCTATGAGTGTCTCTATAGAAATAGGGATGTTGATCATGAATTCGTGGATG - 15600 
-NTGSMSVSIEIGMLIMNSWM 
-TQAL*VSL*K*GC*S*IRG* 
HRLYECLYRNRDVDHEFVDE 
15601 - AGTTTTACGCTTACCTGCGTAAACATTTCTCCATGATGATTCTTTCTGATGATGCCGTTG - 15660
-SFTLTCVNISP**FFLMMPL
-VLRLPA*TFLHDDSF**CRC
FYAYLRKHFSMMILSDDAVV 
15661 - TGTGCTATAACAGTAACTATGCGGCTCAAGGTTTAGTAGCTAGCATTAAGAACTTTAAGG - 15720 
-CAITVTMRLKV**LALRTLR
-VL*Q*LCGSRFSS*H*EL*G
CYNSNYAAQGLVASIKNFKA 
15721 - CAGTTCTTTATTATCAAAATAATGTGTTCATGTCTGAGGCAAAATGTTGGACTGAGACTG - 15780
-QFFIIKIMCSCLRQNVGLRL
-SSLLSK*CVHV*GKMLD*D*
VLYYQNNVFMSEAKCWTETD
15781 - ACCTTACTAAAGGACCTCACGAATTTTGCTCACAGCATACAATGCTAGTTAAACAAGGAG - 15840 
-TLLKDLTNFAHSIQC*LNKE 
-PY*RTSRILLTAYNAS*TRR 
LTKGPHEFCSQHTMLVKQGD
15841 - ATGATTACGTGTACCTGCCTTACCCAGATCCATCAAGAATATTAGGCGCAGGCTGTTTTG - 15900
-MITCTCLTQIHQEY*AQAVL
-*LRVPALPRSIKNIRRRLFC
DYVYLPYPDPSRILGAGCFV
15901 - TCGATGATATTGTCAAAACAGATGGTACACTTATGATTGAAAGGTTCGTGTCACTGGCTA - 15960
-SMILSKQMVHL*LKGSCHWL
-R*YCQNRWYTYD*KVRVTGY
DDIVKTDGTLMIERFVSLAI 
15961 - TTGATGCTTACCCACTTACAAAACATCCTAATCAGGAGTATGCTGATGTCTTTCACTTGT - 16020
-LMLTHLQNILIRSMLMSFTC
-*CLPTYKTS*SGVC*CLSLV
DAYPLTKHPNQEYADVFHLY 

16021 - ATTTACAATACATTAGAAAGTTACATGATGAGCTTACTGGCCACATGTTGGACATGTATT - 16080 
-IYNTLESYMMSLLATCWTCI 
-FTIH*KVT**AYWPHVGHVF 
LQYIRKLHDELTGHMLDMYS 

16081 - CCGTAATGCTAACTAATGATAACACCTCACGGTACTGGGAACCTGAGTTTTATGAGGCTA - 16140 
-P*C*LMITPHGTGNLSFMRL 
-RNAN***HLTVLGT*VL*GY 
VMLTNDNTSRYWEPEFYEAM 

16141 - TGTACACACCACATACAGTCTTGCAGGCTGTAGGTGCTTGTGTATTGTGCAATTCACAGA - 16200
-CTHHIQSCRL*VLVYCAIHR
-VHTTYSLAGCRCLCIVQFTD
YTPHTVLQAVGACVLCNSQT
16201 - CTTCACTTCGTTGCGGTGCCTGTATTAGGAGACCATTCCTATGTTGCAAGTGCTGCTATG - 16260 
-LHFVAVPVLGDHSYVASAAM 
-FTSLRCLY*ETIPMLQVLL*
SLRCGACIRRPFLCCKCCYD 
16261 - ACCATGTCATTTCAACATCACACAAATTAGTGTTGTCTGTTAATCCCTATGTTTGCAATG - 16320 
-TMSFQHHTN*CCLLIPMFAM
-PCHFNITQISVVC*SLCLQC
HVISTSHKLVLSVNPYVCNA 
16321 - CCCCAGGTTGTGATGTCACTGATGTGACACAACTGTATCTAGGAGGTATGAGCTATTATT - 16380 
-PQVVMSLM*HNCI*EV*AII
-PRL*CH*CDTTVSRRYELLL 
PGCDVTDVTQLYLGGMSYYC 
16381 - GCAAGTCACATAAGCCTCCCATTAGTTTTCCATTATGTGCTAATGGTCAGGTTTTTGGTT - 16440
-ASHISLPLVFHYVLMVRFLV
-QVT*ASH*FSIMC*WSGFWF
KSHKPPISFPLCANGQVFGL
16441 - TATACAAAAACACATGTGTAGGCAGTGACAATGTCACTGACTTCAATGCGATAGCAACAT - 16500
-YTKTHV*AVTMSLTSMR*QH
-IQKHMCRQ*QCH*LQCDSNM 
YKNTCVGSDNVTDFNAIATC 
16501 - GTGATTGGACTAATGCTGGCGATTACATACTTGCCAACACTTGTACTGAGAGACTCAAGC - 16560 
-VIGLMLAITYLPTLVLRDSS 
-*LD*CWRLHTCQHLY*ETQA 
DWTNAGDYILANTCTERLKL 
16561 - TTTTCGCAGCAGAAACGCTCAAAGCCACTGAGGAAACATTTAAGCTGTCATATGGTATTG - 16620
-FSQQKRSKPLRKHLSCHMVL 
-FRSRNAQSH*GNI*AVIWYC 
FAAETLKATEETFKLSYGIA 
16621 - CCACTGTACGCGAAGTACTCTCTGACAGAGAATTGCATCTTTCATGGGAGGTTGGAAAAC - 16680
-PLYAKYSLTENCIFHGRLEN 
-HCTRSTL*QRIASFMGGWKT 
TVREVLSDRELHLSWEVGKP 
16681 - CTAGACCACCATTGAACAGAAACTATGTCTTTACTGGTTACCGTGTAACTAAAAATAGTA - 16740 
-LDHH*TETMSLLVTV*LKIV
-*TTIEQKLCLYWLPCN*K** 
RPPLNRNYVFTGYRVTKNSK 
16741 - AAGTACAGATTGGAGAGTACACCTTTGAAAAAGGTGACTATGGTGATGCTGTTGTGTACA - 16800 
-KYRLESTPLKKVTMVMLLCT
-STDWRVHL*KR*LW*CCCVQ 
VQIGEYTFEKGDYGDAVVYR 
16801 - GAGGTACTACGACATACAAGTTGAATGTTGGTGATTACTTTGTGTTGACATCTCACACTG - 16860
-EVLRHTS*MLVITLC*HLTL 
-RYYDIQVECW*LLCVDISHC 
GTTTYKLNVGDYFVLTSHTV 
16861 - TAATGCCACTTAGTGCACCTACTCTAGTGCCACAAGAGCACTATGTGAGAATTACTGGCT - 16920 
-*CHLVHLL*CHKSTM*ELLA 
-NAT*CTYSSATRALCENYWL 
MPLSAPTLVPQEHYVRITGL 
16921 - TGTACCCAACACTCAACATCTCAGATGAGTTTTCTAGCAATGTTGCAAATTATCAAAAGG - 16980 
-CTQHSTSQMSFLAMLQIIKR
-VPNTQHLR*VF*QCCKLSKG
YPTLNISDEFSSNVANYQKV 
16981 - TCGGCATGCAAAAGTACTCTACACTCCAAGGACCACCTGGTACTGGTAAGAGTCATTTTG - 17040 
-SACKSTLHSKDHLVLVRVIL 
-RHAKVLYTPRTTWYW*ESFC
GMQKYSTLQGPPGTGKSHFA 
17041 - CCATCGGACTTGCTCTCTATTACCCATCTGCTCGCATAGTGTATACGGCATGCTCTCATG - 17100
-PSDLLSITHLLA*CIRHALM
-HRTCSLLPICSHSVYGMLSC 
IGLALYYPSARIVYTACSHA 
17101 - CAGCTGTTGATGCCCTATGTGAAAAGGCATTAAAATATTTGCCCATAGATAAATGTAGTA - 17160 
-QLLMPYVKRH*NICP*INVV 
-SC*CPM*KGIKIFAHR*M**
AVDALCEKALKYLPIDKCSR
17161 - GAATCATACCTGCGCGTGCGCGCGTAGAGTGTTTTGATAAATTCAAAGTGAATTCAACAC - 17220 
-ESYLRVRA*SVLINSK*IQH
-NHTCACARRVF**IQSEFNT 
IIPARARVECFDKFKVNSTL 
17221 - TAGAACAGTATGTTTTCTGCACTGTAAATGCATTGCCAGAAACAACTGCTGACATTGTAG - 17280
-*NSMFSAL*MHCQKQLLTL*
-RTVCFLHCKCIARNNC*HCS 
EQYVFCTVNALPETTADIVV 
17281 - TCTTTGATGAAATCTCTATGGCTACTAATTATGACTTGAGTGTTGTCAATGCTAGACTTC - 17340 
-SLMKSLWLLIMT*VLSMLDF
-L**NLYGY*L*LECCQC*TS 
FDEISMATNYDLSVVNARLR 
17341 - GTGCAAAACACTACGTCTATATTGGCGATCCTGCTCAATTACCAGCCCCCCGCACATTGC - 17400 
-VQNTTSILAILLNYQPPAHC 
-CKTLRLYWRSCSITSPPHIA 
AKHYVYIGDPAQLPAPRTLL 
17401 - TGACTAAAGGCACACTAGAACCAGAATATTTTAATTCAGTGTGCAGACTTATGAAAACAA - 17460 
-*LKAH*NQNILIQCADL*KQ 
-D*RHTRTRIF*FSVQTYENN 
TKGTLEPEYFNSVCRLMKTI 
17461 - TAGGTCCAGACATGTTCCTTGGAACTTGTCGCCGTTGTCCTGCTGAAATTGTTGACACTG - 17520 
-*VQTCSLELVAVVLLKLLTL 
-RSRHVPWNLSPLSC*NC*HC 
GPDMFLGTCRRCPAEIVDTV 
17521 - TGAGTGCTTTAGTTTATGACAATAAGCTAAAAGCACACAAGGATAAGTCAGCTCAATGCT - 17580 
-*VL*FMTIS*KHTRISQLNA 
-ECFSL*Q*AKSTQG*VSSML 
SALVYDNKLKAHKDKSAQCF 
17581 - TCAAAATGTTCTACAAAGGTGTTATTACACATGATGTTTCATCTGCAATCAACAGACCTC - 17640 
-SKCSTKVLLHMMFHLQSTDL 
-QNVLQRCYYT*CFICNQQTS 
KMFYKGVITHDVSSAINRPQ 



FIG. 11 Con't 




17641 - AAATAGGCGTTGTAAGAGAATTTCTTACACGCAATCCTGCTTGGAGAAAAGCTGTTTTTA - 17700
-K*AL*ENFLHAILLGEKLFL
-NRRCKRISYTQSCLEKSCFY
IGVVREFLTRNPAWRKAVFI
17701 - TCTCACCTTATAATTCACAGAACGCTGTAGCTTCAAAAATCTTAGGATTGCCTACGCAGA - 17760
-SHLIIHRTL*LQKS*DCLRR
-LTL*FTERCSFKNLRIAYAD
SPYNSQNAVASKILGLPTQT
17761 - CTGTTGATTCATCACAGGGTTCTGAATATGACTATGTCATATTCACACAAACTACTGAAA - 17820 
-LLIHHRVLNMTMSYSHKLLK 
-C*FITGF*I*LCHIHTNY*N 
VDSSQGSEYDYVIFTQTTET 
17821 - CAGCACACTCTTGTAATGTCAACCGCTTCAATGTGGCTATCACAAGGGCAAAAATTGGCA - 17880
-QHTLVMSTASMWLSQGQKLA 
-STLL*CQPLQCGYHKGKNWH 
AHSCNVNRFNVAITRAKIGI 
17881 - TTTTGTGCATAATGTCTGATAGAGATCTTTATGACAAACTGCAATTTACAAGTCTAGAAA - 17940 
-FCA*CLIEIFMTNCNLQV*K 
-FVHNV**RSL*QTAIYKSRN 
LCIMSDRDLYDKLQFTSLEI
17941 - TACCACGTCGCAATGTGGCTACATTACAAGCAGAAAATGTAACTGGACTTTTTAAGGACT - 18000 
-YHVAMWLHYKQKM*LDFLRT
-TTSQCGYITSRKCNWTF*GL 
PRRNVATLQAENVTGLFKDC 

18001 - GTAGTAAGATCATTACTGGTCTTCATCCTACACAGGCACCTACACACCTCAGCGTTGATA - 18060
-VVRSLLVFILHRHLHTSALI
-**DHYWSSSYTGTYTPQR*Y 
SKIITGLHPTQAPTHLSVDI 
18061 - TAAAATTCAAGACTGAAGGATTATGTGTTGACATACCAGGCATACCAAAGGACATGACCT - 18120
-*NSRLKDYVLTYQAYQRT*P 
-KIQD*RIMC*HTRHTKGHDL 
KFKTEGLCVDIPGIPKDMTY 
18121 - ACCGTAGACTCATCTCTATGATGGGTTTCAAAATGAATTACCAAGTCAATGGTTACCCTA - 18180 
-TVDSSL*WVSK*ITKSMVTL 
-P*THLYDGFQNELPSQWLP* 
RRLISMMGFKMNYQVNGYPN 
18181 - ATATGTTTATCACCCGCGAAGAAGCTATTCGTCACGTTCGTGCGTGGATTGGCTTTGATG - 18240
-ICLSPAKKLFVTFVRGLALM
-YVYHPRRSYSSRSCVDWL*C 
MFITREEAIRHVRAWIGFDV 
18241 - TAGAGGGCTGTCATGCAACTAGAGATGCTGTGGGTACTAACCTACCTCTCCAGCTAGGAT - 18300 
-*RAVMQLEMLWVLTYLSS*D
-RGLSCN*RCCGY*PTSPARI 
EGCHATRDAVGTNLPLQLGF 
18301 - TTTCTACAGGTGTTAACTTAGTAGCTGTACCGACTGGTTATGTTGACACTGAAAATAACA - 18360
-FLQVLT**LYRLVMLTLKIT
-FYRC*LSSCTDWLC*H*K*H 
STGVNLVAVPTGYVDTENNT 
18361 - CAGAATTCACCAGAGTTAATGCAAAACCTCCACCAGGTGACCAGTTTAAACATCTTATAC - 18420 
-QNSPELMQNLHQVTSLNILY 
-RIHQS*CKTSTR*PV*TSYT 
EFTRVNAKPPPGDQFKHLIP 
18421 - CACTCATGTATAAAGGCTTGCCCTGGAATGTAGTGCGTATTAAGATAGTACAAATGCTCA - 18480
-HSCIKACPGM*CVLR*YKCS
-THV*RLALECSAY*DSTNAQ
LMYKGLPWNVVRIKIVQMLS 
18481 - GTGATACACTGAAAGGATTGTCAGACAGAGTCGTGTTCGTCCTTTGGGCGCATGGCTTTG - 18540
-VIH*KDCQTESCSSFGRMAL
-*YTERIVRQSRVRPLGAWL*
DTLKGLSDRVVFVLWAHGFE

18541 - AGCTTACATCAATGAAGTACTTTGTCAAGATTGGACCTGAAAGAACGTGTTGTCTGTGTG - 18600
-SLHQ*STLSRLDLKERVVCV
-AYINEVLCQDWT*KNVLSV*
LTSMKYFVKIGPERTCCLCD
18601 - ACAAACGTGCAACTTGCTTTTCTACTTCATCAGATACTTATGCCTGCTGGAATCATTCTG - 18660 
-TNVQLAFLLHQILMPAGIIL 
-QTCNLLFYFIRYLCLLESFC 
KRATCFSTSSDTYACWNHSV 
18661 - TGGGTTTTGACTATGTCTATAACCCATTTATGATTGATGTTCAGCAGTGGGGCTTTACGG - 18720 
-WVLTMSITHL*LMFSSGALR
-GF*LCL*PIYD*CSAVGLYG 
GFDYVYNPFMIDVQQWGFTG 
18721 - GTAACCTTCAGAGTAACCATGACCAACATTGCCAGGTACATGGAAATGCACATGTGGCTA - 18780
-VTFRVTMTNIARYMEMHMWL
-*PSE*P*PTLPGTWKCTCG*
NLQSNHDQHCQVHGNAHVAS 
18781 - GTTGTGATGCTATCATGACTAGATGTTTAGCAGTCCATGAGTGCTTTGTTAAGCGCGTTG - 18840 
-VVMLS*LDV*QSMSALLSAL
-L*CYHD*MFSSP*VLC*AR* 
CDAIMTRCLAVHECFVKRVD 
18841 - ATTGGTCTGTTGAATACCCTATTATAGGAGATGAACTGAGGGTTAATTCTGCTTGCAGAA - 18900
-IGLLNTLL*EMN*GLILLAE
-LVC*IPYYRR*TEG*FCLQK
WSVEYPIIGDELRVNSACRK
18901 - AAGTACAACACATGGTTGTGAAGTCTGCATTGCTTGCTGATAAGTTTCCAGTTCTTCATG - 18960 
-KYNTWL*SLHCLLISFQFFM 
-STTHGCEVCIAC**VSSSS* 
VQHMVVKSALLADKFPVLHD 
18961 - ACATTGGAAATCCAAAGGCTATCAAGTGTGTGCCTCAGGCTGAAGTAGAATGGAAGTTCT - 19020 
-TLEIQRLSSVCLRLK*NGSS
-HWKSKGYQVCASG*SRMEVL
IGNPKAIKCVPQAEVEWKFY 
19021 - ACGATGCTCAGCCATGTAGTGACAAAGCTTACAAAATAGAGGAACTCTTCTATTCTTATG - 19080 
-TMLSHVVTKLTK*RNSSILM 
-RCSAM**QSLQNRGTLLFLC
DAQPCSDKAYKIEELFYSYA 
19081 - CTACACATCACGATAAATTCACTGATGGTGTTTGTTTGTTTTGGAATTGTAACGTTGATC - 19140
-LHITINSLMVFVCFGIVTLI
-YTSR*IH*WCLFVLEL*R*S
THHDKFTDGVCLFWNCNVDR
19141 - GTTACCCAGCCAATGCAATTGTGTGTAGGTTTGACACAAGAGTCTTGTCAAACTTGAACT - 19200 
-VTQPMQLCVGLTQESCQT*T 
-LPSQCNCV*V*HKSLVKLEL 
YPANAIVCRFDTRVLSNLNL 
19201 - TACCAGGCTGTGATGGTGGTAGTTTGTATGTGAATAAGCATGCATTCCACACTCCAGCTT - 19260
-YQAVMVVVCM*ISMHSTLQL 
-TRL*WW*FVCE*ACIPHSSF 
PGCDGGSLYVNKHAFHTPAF 
19261 - TCGATAAAAGTGCATTTACTAATTTAAAGCAATTGCCTTTCTTTTACTATTCTGATAGTC - 19320
-SIKVHLLI*SNCLSFTILIV 
-R*KCIY*FKAIAFLLLF**S 
19321 - CTTGTGAGTCTCATGGCAAACAAGTAGTGTCGGATATTGATTATGTTCCACTCAAATCTG - 19380
-LVSLMANK*CRILIMFHSNL
-L*VSWQTSSVGY*LCSTQIC
CESHGKQVVSDIDYVPLKSA

19381 - CTACGTGTATTACACGATGCAATTTAGGTGGTGCTGTTTGCAGACACCATGCAAATGAGT - 19440 
-LRVLHDAI*VVLFADTMQMS
-YVYYTMQFRWCCLQTPCK*V
TCITRCNLGGAVCRHHANEY
19441 - ACCGACAGTACTTGGATGCATATAATATGATGATTTCTGCTGGATTTAGCCTATGGATTT - 19500 
-TDSTWMHII**FLLDLAYGF
-PTVLGCI*YDDFCWI*PMDL
RQYLDAYNMMISAGFSLWIY
19501 - ACAAACAATTTGATACTTATAACCTGTGGAATACATTTACCAGGTTACAGAGTTTAGAAA - 19560 
-TNNLILITCGIHLPGYRV*K 

- Q T I *YL* PVEYIYQVTEFRK 

KQFDTYNLWNTFTRLQSLEN 
19561 - ATGTGGCTTATAATGTTGTTAATAAAGGACACTTTGATGGACACGCCGGCGAAGCACCTG - 19620 
-MWLIMLLIKDTLMDTPAKHL 
-CGL*CC**RTL*WTRRRSTC
VAYNVVNKGHFDGHAGEAPV 
19621 - TTTCCATCATTAATAATGCTGTTTACACAAAGGTAGATGGTATTGATGTGGAGATCTTTG - 19680 
-FPSLIMLFTQR*MVLMWRSL 
-FHH**CCLHKGRWY*CGDL* 
SIINNAVYTKVDGIDVEIFE
19681 - AAAATAAGACAACACTTCCTGTTAATGTTGCATTTGAGCTTTGGGCTAAGCGTAACATTA - 19740 
-KIRQHFLLMLHLSFGLSVTL 
-K*DNTSC*CCI*ALG*A*H* 
NKTTLPVNVAFELWAKRNIK 
19741 - AACCAGTGCCAGAGATTAAGATACTCAATAATTTGGGTGTTGATATCGCTGCTAATACTG - 19800 
-NQCQRLRYSIIWVLISLLIL
-TSARD*DTQ*FGC*YRC*YC
PVPEIKILNNLGVDIAANTV
19801 - TAATCTGGGACTACAAAAGAGAAGCCCCAGCACATGTATCTACAATAGGTGTCTGCACAA - 19860
-*SGTTKEKPQHMYLQ*VSAQ 
-NLGLQKRSPSTCIYNRCLHN 
IWDYKREAPAHVSTIGVCTM 
19861 - TGACTGACATTGCCAAGAAACCTACTGAGAGTGCTTGTTCTTCACTTACTGTCTTGTTTG - 19920
-*LTLPRNLLRVLVLHLLSCL
-D*HCQETY*ECLFFTYCLV* 
TDIAKKPTESACSSLTVLFD 
19921 - ATGGTAGAGTGGAAGGACAGGTAGACCTTTTTAGAAACGCCCGTAATGGTGTTTTAATAA - 19980 
-MVEWKDR*TFLETPVMVF**
-W*SGRTGRPF*KRP*WCFNN 
GRVEGQVDLFRNARNGVLIT 
19981 - CAGAAGGTTCAGTCAAAGGTCTAACACCTTCAAAGGGACCAGCACAAGCTAGCGTCAATG - 20040
-QKVQSKV*HLQRDQHKLASM 
-RRFSQRSNTFKGTSTS*RQW 
EGSVKGLTPSKGPAQASVNG 
20041 - GAGTCACATTAATTGGAGAATCAGTAAAAACACAGTTTAACTACTTTAAGAAAGTAGACG - 20100 
-ESH*LENQ*KHSLTTLRK*T 
-SHINWRISKNTV*LL*ESRR
VTLIGESVKTQFNYFKKVDG 
20101 - GCATTATTCAACAGTTGCCTGAAACCTACTTTACTCAGAGCAGAGACTTAGAGGATTTTA - 20160
-ALFNSCLKPTLLRAET*RIL
-HYSTVA*NLLYSEQRLRGF*
IIQQLPETYFTQSRDLEDFK
20161 - AGCCCAGATCACAAATGGAAACTGACTTTCTCGAGCTCGCTATGGATGAATTCATACAGC - 20220
-SPDHKWKLTFSSSLWMNSYS
-AQITNGN*LSRARYG*IHTA
PRSQMETDFLELAMDEFIQR 
20221 - GATATAAGCTCGAGGGCTATGCCTTCGAACACATCGTTTATGGAGATTTCAGTCATGGAC - 20280 
-DISSRAMPSNTSFMEISVMD 
-I*ARGLCLRTHRLWRFQSWT
YKLEGYAFEHIVYGDFSHGQ 
20281 - AACTTGGCGGTCTTCATTTAATGATAGGCTTAGCCAAGCGCTCACAAGATTCACCACTTA - 20340 
-NLAVFI***A*PSAHKIHHL 
-TWRSSFNDRLSQALTRFTT*
LGGLHLMIGLAKRSQDSPLK 
20341 - AATTAGAGGATTTTATCCCTATGGACAGCACAGTGAAAAATTACTTCATAACAGATGCGC - 20400
-N*RILSLWTAQ*KITS*QMR
-IRGFYPYGQHSEKLLHNRCA
LEDFIPMDSTVKNYFITDAQ
20401 - AAACAGGTTCATCAAAATGTGTGTGTTCTGTGATTGATCTTTTACTTGATGACTTTGTCG - 20460
-KQVHQNVCVL*LIFYLMTLS 
-NRFIKMCVFCD*SFT**LCR 
TGSSKCVCSVIDLLLDDFVE 
20461 - AGATAATAAAGTCACAAGATTTGTCAGTGATTTCAAAAGTGGTCAAGGTTACAATTGACT - 20520
-R**SHKICQ*FQKWSRLQLT
-DNKVTRFVSDFKSGQGYN*L
IIKSQDLSVISKVVKVTIDY
20521 - ATGCTGAAATTTCATTCATGCTTTGGTGTAAGGATGGACATGTTGAAACCTTCTACCCAA - 20580 
-MLKFHSCFGVRMDMLKPSTQ
-C*NFIHALV*GWTC*NLLPK 
AEISFMLWCKDGHVETFYPK 
20581 - AACTACAAGCAAGTCAAGCGTGGCAACCAGGTGTTGCGATGCCTAACTTGTACAAGATGC - 20640
-NYKQVKRGNQVLRCLTCTRC 
-TTSKSSVATRCCDA*LVQDA 
LQASQAWQPGVAMPNLYKMQ 
20641 - AAAGAATGCTTCTTGAAAAGTGTGACCTTCAGAATTATGGTGAAAATGCTGTTATACCAA - 20700 
-KECFLKSVTFRIMVKMLLYQ 
-KNAS*KV*PSELW*KCCYTK 
RMLLEKCDLQNYGENAVIPK
20701 - AAGGAATAATGATGAATGTCGCAAAGTATACTCAACTGTGTCAATACTTAAATACACTTA - 20760 
-KE***MSQSILNCVNT*IHL
-RNNDECRKVYSTVSILKYTY 
GIMMNVAKYTQLCQYLNTLT 
20761 - CTTTAGCTGTACCCTACAACATGAGAGTTATTCACTTTGGTGCTGGCTCTGATAAAGGAG - 20820
-L*LYPTT*ELFTLVLALIKE
-FSCTLQHESYSLWCWL**RS
LAVPYNMRVIHFGAGSDKGV 
20821 - TTGCACCAGGTACAGCTGTGCTCAGACAATGGTTGCCAACTGGCACACTACTTGTCGATT - 20880 
-LHQVQLCSDNGCQLAHYLSI
-CTRYSCAQTMVANWHTTCRF
APGTAVLRQWLPTGTLLVDS
20881 - CAGATCTTAATGACTTCGTCTCCGACGCAGATTCTACTTTAATTGGAGACTGTGCAACAG - 20940 
-QILMTSSPTQILL*LETVQQ
-RS**LRLRRRFYFNWRLCNS
DLNDFVSDADSTLIGDCATV
20941 - TACATACGGCTAATAAATGGGACCTTATTATTAGCGATATGTATGACCCTAGGACCAAAC - 21000 
-YIRLINGTLLLAICMTLGPN
-TYG**MGPYY*RYV*P*DQT
HTANKWDLIISDMYDPRTKH
21001 - ATGTGACAAAAGAGAATGACTCTAAAGAAGGGTTTTTCACTTATCTGTGTGGATTTATAA - 21060
-M*QKRMTLKKGFSLICVDL*
-CDKRE*L*RRVFHLSVWIYK
VTKENDSKEGFFTYLCGFIK
21061 - AGCAAAAACTAGCCCTGGGTGGTTCTATAGCTGTAAAGATAACAGAGCATTCTTGGAATG - 21120 
-SKN*PWVVL*L*R*QSILGM
-AKTSPGWFYSCKDNRAFLEC
QKLALGGSIAVKITEHSWNA
21121 - CTGACCTTTACAAGCTTATGGGCCATTTCTCATGGTGGACAGCTTTTGTTACAAATGTAA - 21180
-LTFTSLWAISHGGQLLLQM* 
-*PLQAYGPFLMVDSFCYKCK 
DLYKLMGHFSWWTAFVTNVN 
21181 - ATGCATCATCATCGGAAGCATTTTTAATTGGGGCTAACTATCTTGGCAAGCCGAAGGAAC - 21240 
-MHHHRKHF*LGLTILASRRN 
-CIIIGSIFNWG*LSWQAEGT 
ASSSEAFLIGANYLGKPKEQ 
21241 - AAATTGATGGCTATACCATGCATGCTAACTACATTTTCTGGAGGAACACAAATCCTATCC - 21300 
-KLMAIPCMLTTFSGGTQILS
-N*WLYHAC*LHFLEEHKSYP
IDGYTMHANYIFWRNTNPIQ
21301 - AGTTGTCTTCCTATTCACTCTTTGACATGAGCAAATTTCCTCTTAAATTAAGAGGAACTG - 21360
-SCLPIHSLT*ANFLLN*EEL 
-VVFLFTL*HEQISS*IKRNC 
LSSYSLFDMSKFPLKLRGTA 
21361 - CTGTAATGTCTCTTAAGGAGAATCAAATCAATGATATGATTTATTCTCTTCTGGAAAAAG - 21420 
-L*CLLRRIKSMI*FILFWKK
-CNVS*GESNQ*YDLFSSGKR 
VMSLKENQINDMIYSLLEKG 
21421 - GTAGGCTTATCATTAGAGAAAACAACAGAGTTGTGGTTTCAAGTGATATTCTTGTTAACA - 21480
-VGLSLEKTTELWFQVIFLLT
-*AYH*RKQQSCGFK*YSC*Q
RLIIRENNRVVVSSDILVNN
21481 - ACTAAACGAACATGTTTATTTTCTTATTATTTCTTACTCTCACTAGTGGTAGTGACCTTG - 21540
-TKRTCLFSYYFLLSLVVVTL
-LNEHVYFLIISYSH*W**P*
*TNMFIFLLFLTLTSGSDLD
21541 - ACCGGTGCACCACTTTTGATGATGTTCAAGCTCCTAATTACACTCAACATACTTCATCTA - 21600
-TGAPLLMMFKLLITLNILHL
-PVHHF**CSSS*LHSTYFIY
RCTTFDDVQAPNYTQHTSSM
21601 - TGAGGGGGGTTTACTATCCTGATGAAATTTTTAGATCAGACACTCTTTATTTAACTCAGG - 21660 
-*GGFTILMKFLDQTLFI*LR 
-EGGLLS**NF*IRHSLFNSG
RGVYYPDEIFRSDTLYLTQD
21661 - ATTTATTTCTTCCATTTTATTCTAATGTTACAGGGTTTCATACTATTAATCATACGTTTG - 21720 
-IYFFHFILMLQGFILLIIRL
-FISSILF*CYRVSYY*SYVW
LFLPFYSNVTGFHTINHTFG 
21721 - GCAACCCTGTCATACCTTTTAAGGATGGTATTTATTTTGCTGCCACAGAGAAATCAAATG - 21780 
-ATLSYLLRMVFILLPQRNQM
-QPCHTF*GWYLFCCHREIKC 
NPVIPFKDGIYFAATEKSNV 
21781 - TTGTCCGTGGTTGGGTTTTTGGTTCTACCATGAACAACAAGTCACAGTCGGTGATTATTA - 21840 
-LSVVGFLVLP*TTSHSR*LL 
-CPWLGFWFYHEQQVTVGDYY 
VRGWVFGSTMNNKSQSVIII
21841 - TTAACAATTCTACTAATGTTGTTATACGAGCATGTAACTTTGAATTGTGTGACAACCCTT - 21900
-LTILLMLLYEHVTLNCVTTL
-*QFY*CCYTSM*L*IV*QPF
NNSTNVVIRACNFELCDNPF
21901 - TCTTTGCTGTTTCTAAACCCATGGGTACACAGACACATACTATGATATTCGATAATGCAT - 21960
-SLLFLNPWVHRHIL*YSIMH
-LCCF*THGYTDTYYDIR*CI
FAVSKPMGTQTHTMIFDNAF
21961 - TTAATTGCACTTTCGAGTACATATCTGATGCCTTTTCGCTTGATGTTTCAGAAAAGTCAG - 22020
-LIALSSTYLMPFRLMFQKSQ
-*LHFRVHI*CLFA*CFRKVR
NCTFEYISDAFSLDVSEKSG
22021 - GTAATTTTAAACACTTACGAGAGTTTGTGTTTAAAAATAAAGATGGGTTTCTCTATGTTT - 22080 
-VILNTYESLCLKIKMGFSMF 
-*F*TLTRVCV*K*RWVSLCL 
NFKHLREFVFKNKDGFLYVY 
22081 - ATAAGGGCTATCAACCTATAGATGTAGTTCGTGATCTACCTTCTGGTTTTAACACTTTGA - 22140
-IRAINL*M*FVIYLLVLTL*
-*GLSTYRCSS*STFWF*HFE
KGYQPIDVVRDLPSGFNTLK
22141 - AACCTATTTTTAAGTTGCCTCTTGGTATTAACATTACAAATTTTAGAGCCATTCTTACAG - 22200 
-NLFLSCLLVLTLQILEPFLQ 
-TYF*VASWY*HYKF*SHSYS 
PIFKLPLGINITNFRAILTA 
22201 - CCTTTTCACCTGCTCAAGACATTTGGGGCACGTCAGCTGCAGCCTATTTTGTTGGCTATT - 22260 
-PFHLLKTFGARQLQPILLAI 
-LFTCSRHLGHVSCSLFCWLF 
FSPAQDIWGTSAAAYFVGYL
22261 - TAAAGCCAACTACATTTATGCTCAAGTATGATGAAAATGGTACAATCACAGATGCTGTTG - 22320
-*SQLHLCSSMMKMVQSQMLL
-KANYIYAQV**KWYNHRCC*
KPTTFMLKYDENGTITDAVD
22321 - ATTGTTCTCAAAATCCACTTGCTGAACTCAAATGCTCTGTTAAGAGCTTTGAGATTGACA - 22380 
-IVLKIHLLNSNALLRALRLT 
-LFSKSTC*TQMLC*EL*D*Q 
CSQNPLAELKCSVKSFEIDK 
22381 - AAGGAATTTACCAGACCTCTAATTTCAGGGTTGTTCCCTCAGGAGATGTTGTGAGATTCC - 22440 
-KE FTRPLISGLFPQEML* DS 
-RNLPDL*FQGCSLRRCCEIP 
GIYQTSNFRVVPSGDVVRFP 
22441 - CTAATATTACAAACTTGTGTCCTTTTGGAGAGGTTTTTAATGCTACTAAATTCCCTTCTG - 22500 
-LI LQTCVLLERFLMLLNSLL 
-*YYKLVSFWRGF*CY*IPFC 
NITNLCPFGEVFNATKFPSV 
22501 - TCTATGCATGGGAGAGAAAAAAAATTTCTAATTGTGTTGCTGATTACTCTGTGCTCTACA - 22560
-SMHGREKKFLIVLLITLCST 
-LCMGEKKNF*LCC*LLCALQ 
YAWERKKISNCVADYSVLYN 
22561 - ACTCAACATTTTTTTCAACCTTTAAGTGCTATGGCGTTTCTGCCACTAAGTTGAATGATC - 22620 
-TQHFFQPLSAMAFLPLS*MI 
-LNIFFNL*VLWRFCH*VE*S 
STFFSTFKCYGVSATKLNDL 
22621 - TTTGCTTCTCCAATGTCTATGCAGATTCTTTTGTAGTCAAGGGAGATGATGTAAGACAAA - 22680
-FASPMSMQILL* S R E M M * D K 
-LLLQCLCRFFCSQGR*CKTN 
CFSNVYADSFVVKGDDVRQI 
22681 - TAGCGCCAGGACAAACTGGTGTTATTGCTGATTATAATTATAAATTGCCAGATGATTTCA - 22740
-*RQDKLVLLLIIIINCQMIS 
-SARTNWCYC*I>*L* I A R * FH 
APGQTGVIADYNYKLPDDFM 
22741 - TGGGTTGTGTCCTTGCTTGGAATACTAGGAACATTGATGCTACTTCAACTGGTAATTATA - 22800 
-WVVSLLGILGTLMLLQLVI I 
-GLCPC1EY*EH*CYFNW*L* 
GCVLAWNTRNIDATSTGNYN 
22801 - ATTATAAATATAGGTATCTTAGACATGGCAAGCTTAGGCCCTTTGAGAGAGACATATCTA - 22860
-IINIGILDMASLGPLRETYL 
-L*I*VS*TWQA.*AL*ERHI* 
YKYRYLRHGKLRPFERDISN 
22861 - ATGTGCCTTTCTCCCCTGATGGCAAACCTTGCACCCCACCTGCTCTTAATTGTTATTGGC - 22920 
-MCLSPLMANLAPHLLLIVIG 
-CAFLP*WQTLHPaCS*IiLIiA 
VPFSPDGKPCTPPALNCYWP 
22921 - CATTAAATGATTATGGTTTTTACACCACTACTGGCATTGGCTACCAACCTTACAGAGTTG - 22980 
-H*MIMVFTPLLALATNLTEL 
-IK*LWFLHHYWHWLPTLQSC 

- LNDYGFYTTTGIGYQPYRVV 

22981 - TAGTACTTTCTTTTGAACTTTTAAATGCACCGGCCACGGTTTGTGGACCAAAATTATCCA - 23040 
-*YFLLNF*MHRPRFVDQNYP 
-STFF*TFKCTGHGLWTKI IH 
VLSFELLNAPATVCGPKLST 

23041 - CTGACCTTATTAAGAACCAGTGTGTCAATTTTAATTTTAATGGACTCACTGGTACTGGTG - 23100 
-LTLLRTSVS ILI LMDS LVLV 
_*PY*EPVCQF*F*WTHWYWC 
DLIKNQCVNFNFNGLTGTGV 

23101 - TGTTAACTCCTTCTTCAAAGAGATTTCAACCATTTCAACAATTTGGCCGTGATGTTTCTG - 23160 
-C*LLLQRDFNHFNNLAVMFL 

- VNSFFKEISTISTIWP*CF* 

LTPSSKRFQPFQQFGRDVSD 
23161 - ATTTCACTGATTCCGTTCGAGATCCTAAAACATCTGAAATATTAGACATTTCACCTTGCT - 23220 
-ISLIPFEILKHLKY'TFHLA 
-FH*FRSRS*NI*NIRHFTLL 
FTDSVRDPKT3EILDISPCS 
23221 - CTTTTGGGGGTGTAAGTGTAATTACACCTGGAACAAATGCTTCATCTGAAGTTGCTGTTC - 23280
-LLGV 'V* LHLEQMLHLKLLF 
-FWGCKCNYTWNKCFI *SCCE 
FGGVSVITPGTNASSEVAVL 
23281 - TATATCAAGATGTTAACTGCACTGATGTTTCTACAGCAATTCATGCAGATCAACTCACAC - 23340
-YIKMLTALMFLQQFMQINSH 
-ISRC*LH*CFYSNSCRSTHT 
YQDVNCTDVSTAIHADQLTP 
23341 - CAGCTTGGCGCATATATTCTACTGGAAACAATGTATTCCAGACTCAAGCAGGCTGTCTTA - 23400
-QLGAYILLETMYSRLKQAVL 
-SLAHIFYWKQCIPDSSRLSY 
AWRIYSTGMNVFQTQAGCLI 
23401 - TAGGAGCTGAGCATGTCGACACTTCTTATGAGTGCGACATTCCTATTGGAGCTGGCATTT - 23460 
-*ELSMSTLI)MSATFLLELAF 
-RS*ACRHFL*VRHSYWSWHL 
GAEHVDTSYECDIPIGAGIC 
23461 - GTGCTAGTTACCATACAGTTTCTTTATTACGTAGTACTAGCCAAAAATCTATTGTGGCTT - 23520 
-VLVT I Q FLYYVVLA.KNLLW L 
-C*LPYSFFIT*Y*PKIYCGL 

- A S Y H T V S I. L R S T S Q K S I V A Y 
23521 - ATACTATGTCTTTAGGTGCTGATAGTTCAATTGCTTACTCTAATAACACCATTGCTATAC - 23580
-ILCL*VLIVQLLTLITPL1Y 
-YYVFRC**FNCLL**HHCYT 
TM S LGAD 'S S IAYSNN T I AI P 
23581 - CTACTAACTTTTCAATTAGCATTACTACAGAAGTAATGCCTGTTTCTATGGCTAAAACCT - 23640
-LLTFQLALLQK*CLFLWLKP 
-Y*LFN i HYYRSNACFYG*NL 
TNFSISITTEVMPVSMAKTS 
23641 - CCGTAGATTGTAATATGTACATCTGCGGAGATTCTACTGAATGTGCTAATTTGCTTCTCC - 23700
-P*IVICTSAEILLNVLICFS 
-RRL*YVHLRRFY*MC* FASP 
VDCNMYICGDSTECANLLLQ 
23701 - AATATGGTAGCTTTTGCACACAACTAAATCGTGCACTCTCAGGTATTGCTGCTGAACAGG - 23760
-NMVAFAH N * I V H SQVLLLNR 
-IW*LLHTTKSCTLRYCC*TG 
YGS FCTQLNRALSGIAAEQD 
23761 - ATCGCAACACACGTGAAGTGTTCGCTCAAGTCAAACAAATGTACAAAACCCCAACTTTGA - 23820
-IATHVKCSLKSNKCTKPQL* 
-SQHT*SVRSSQTNVQNPNFE 
RNTREVFAQVKQMYKTPTLK 
23821 - AATATTTTGGTGGTTTTAATTTTTCACAAATATTACCTGACCCTCTAAAGCCAACTAAGA - 23880 
-NILVVLIFHKYYLTL*SQLR 
-IFWWF*FFTNIT*PSKAN*E 
YFGGFNFSQILPDPLKPTKR 
23881 - GGTCTTTTATTGAGGACTTGCTCTTTAATAAGGTGACACTCGCTGATGCTGGCTTCATGA - 23940 
-GLLLRTCSLIR*HSLMLAS* 
-VFY*GIiAL**GDTR*CWLHE 
SFIEDLLFNKVTLADAGFMK 
23941 - AGCAATATGGCGAATGCCTAGGTGATATTAATGCTAGAGATCTCATTTGTGCGCAGAAGT - 24000 
-SNMANA*VILMIiEISFVRRS 
-AIWRMPR*Y*C*RSHLCAEV 
QYGECZiGDINARDLICAQKF 
24001 - TCAATGGACTTACAGTGTTGCCACCTCTGCTCACTGATGATATGATTGCTGCCTACACTG - 24060
-SMDLQCCHLCSLMI * LLPTL 
-QWTYSVATSAH**YDCCLHC 
NGLTVLPPLLTDDMIAAYTA 
24061 - CTGCTCTAGTTAGTGGTACTGCCACTGCTGGATGGACATTTGGTGCTGGCGCTGCTCTTC - 24120
-LL*LVVLPLLDGHLVLALLF 
-CSS*WYCHCWMDIWCWRCSS 
ALVSGTATAGWTFGAGAALQ 
24121 - AAATACCTTTTGCTATGCAAATGGCATATAGGTTCAATGGCATTGGAGTTACCCAAAATG - 24180
-KYLLLCKWHIGSMALELPKM 
-NTFCYANGI*VQWHWSYPKC 
IPFAMQMAYRFNGIGVTQNV 
24181 - TTCTCTATGAGAACCAAAAACAAATCGCCAACCAATTTAACAAGGCGATTAGTCAAATTC - 24240 
-FSMRTKNKSPTNLTRRLVKF 
-SL*EPKTNRQPI*QGD*SNS 
LYENQKQIANQPNKAISQIQ 
24241 - AAGAATCACTTACAACAACATCAACTGCATTGGGCAAGCTGCAAGACGTTGTTAACCAGA - 24300 
-KNHLQQHQLHWASCKTLLTR 
- RITYNNINCIGQAARRC* PE 
ESLTTTSTALGKLQDVVNQN 
24301 - ATGCTCAAGCATTAAACACACTTGTTAAACAACTTAGCTCTAATTTTGGTGCAATTTCAA - 24360 
-MLKH*THLLNNIiALIIiVQFQ 
-CSSIKHTC*TT*L*FWCNFK 
AQALNTLVKQLSSNFGAISS 
24361 - GTGTGCTAAATGATATCCTTTCGCGACTTGATAAAGTCGAGGCGGAGGTACAAATTGACA - 24420
-VC*MISFRDLIKSRRRYKLT 
-CAK*YPFAT**SRGGGTN*Q 
VLNDILSRLDKVEAEVQIDR 
24421 - GGTTAATTACAGGCAGACTTCAAAGCCTTCAAACCTATGTAACACAACAACTAATCAGGG - 24480
-G*LQADFKAFKPM*HNN*SG 
-VNYRQTSKPSNLCNTTTNQG 
LITGRLQSLQTYVIQQLIRA 
24481 - CTGCTGAAATCAGGGCTTCTGCTAATCTTGCTGCTACTAAAATGTCTGAGTGTGTTCTTG - 24540
-LLKSGLLLILLLLKCLSVFL 
-C*NQGFC*SCCY*NV*VCSW 
AEIRASANLAATKMSECVLG 
24541 - GACAATCAAAAAGAGTTGACTTTTGTGGAAAGGGCTACCACCTTATGTCCTTCCCACAAG - 24600
-DNQKELTFVERATTLCPSHK 
-T IKKS*LLWKGI>EPYVLPTS 
QSKRVDFCGKGYHLMSFPQA 
24601 - CAGCCCCGCATGGTGTTGTCTTCCTACATGTCACGTATGTGCCATCCCAGGAGAGGAACT - 24660
-QPRMVLSSYMSRMCHPRRGT 
-S PAWCCLPTCHVCAIPGEEL 
APHGVVFLHVTYVPSQBRNF 
24661 - TCACCACAGCGCCAGCAATTTGTCATGAAGGCAAAGCATACTTCCCTCGTGAAGGTGTTT - 24720
-SPQRQQFVMKAKHTSLVKVF 
-HHSASNLS*RQSILPS*RCF 
TTAPAICHEGKAYFPREGVF 
24721 - TTGTGTTTAATGGCACTTCTTGGTTTATTACACAGAGGAACTTCTTTTCTCCACAAATAA - 24780 
-LCLMALLGLLHRGTSFLHK* 
-CV*WHFLVYYTEELLFSTNN 
VFNGTSWFITQRNFFSPQII 
24781 - TTACTACAGACAATACATTTGTCTCAGGAAATTGTGATGTCGTTATTGGCATCATTAACA - 24840 
-LLQTIHLSQEIVMSLLASLT 
-YYRQYICLRKti*CRYWHH*'Q 
TTDNTFVSGNCDVVIGIINN 
24841 - ACACAGTTTATGATCCTCTGCAACCTGAGCTTGACTCATTCAAAGAAGAGCTGGACAAGT - 24900 
-TQFMILCNLSIiTHSKKSWTS 
-HSL*SSAT*A*LIQRRAGQV 
TVYDPLQPELDSFKEELDKY 
24901 - ACTTCAAAAATCATACATCACCAGATGTTGATCTTGGCGACATTTCAGGCATTAACGCTT - 24960
-TSKIIHHQMLILATFQALTL 
-LQKSYITRC*SWRHFRH*RF 
FKNHTSPDVDLGDISGINAS 
24961 - CTGTCGTCAACATTCAAAAAGAAATTGACCGCCTCAATGAGGTCGCTAAAAATTTAAATG - 25020 
-LSSTFKKKIiTASMRSLKI*M 
-CRQHSKRN*PPQ*GR*KFK* 
VVNIQKEIDRLNEVAKNLNE 
25021 - AATCACTCATTGACCTTCAAGAATTGGGAAAATATGAGCAATATATTAAATGGCCTTGGT - 25080 
-NHSLTFKNWENMSNILNGLG 
-ITH*PSRIGKI*AIY*MALV 
SLIDLQELGKYEQYIKWPWY 
25081 - ATGTTTGGCTCGGCTTCATTGCTGGACTAATTGCCATCGTCATGGTTACAATCTTGCTTT - 25140
-MFGSASLLD*LPSSWLQSCF 
-CLARLHCWTNCHRHGYNLAL 
VWLGFIAGLIAIVMVTILLC 
25141 - GTTGCATGACTAGTTGTTGCAGTTGCCTCAAGGGTGCATGCTCTTGTGGTTCTTGCTGCA - 25200 
-VA* LVVAVASRVHALVVLAA 
-LHD*LLQLPQGCMLLWFLLQ 
CMTSCCSCLKGACSCGSCCK 
25201 - AGTTTGATGAGGATGACTCTGAGCCAGTTCTCAAGGGTGTCAAATTACATTACACATAAA - 25260
-S LMRMTLSQFSRVSNY I THK 
-V**G*L*ASSQGCQITLHIN 
FDEDDSEPVLKGVKLHYT*T 
25261 - CGAACTTATGGATTTGTTTATGAGATTTTTTACTCTTGGATCAATTACTGCACAGCCAGT - 25320 
-RTYGFVYEIFYSWINYCTAS 
-ELMDLFMRFFTLGSITAQPV 
NLWICL*DFLLLDQLLHSQ* 
25321 - AAAAATTGACAATGCTTCTCCTGCAAGTACTGTTCATGCTACAGCAACGATACCGCTACA - 25380 

- K N * QCFSCKYCSCYSNDTAT 
-KIDNASPASTVHRTATIPLQ 

KLTMLLLQVLFMLQQRYRYK 
25381 - AGCCTCACTCCCTTTCGGATGGCTTGTTATTGGCGTTGCATTTCTTGCTGTTTTTCAGAG - 25440
-SLTPFRMACYWRCISCCFSE 
-ASLPFGWLVIGVAFLAVFQS 
PHSLSDGLLLALHFLLFFRA 
25441 - CGCTACCAAAATAATTGCGCTCAATAAAAGATGGCAGCTAGCCCTTTATAAGGGCTTCCA - 25500 
-RYQNNCAQ*KMAASPL*GLP 

- A T K I IALNKRWQLALYKGFQ 

LPK*IiRSIKDG3*PFIRASS 
25501 - GTTCATTTGCAATTTACTGCTGCTATTTGTTACCATCTATTCACATCTTTTGCTTGTCGC - 25560
-VHLQFTAAICYHLFTSFACR 
-FICNLLLLFVTIYSHLLLVA 
SFAIYCCYLLPSIHIFCLSL 
25561 - TGCAGGTAAGGAGGCGCAATTTTTGTACCTCTATGCCTTGATATATTTTCTACAATGCAT - 25620
-CR*GGAIFVPLCLD,IFSTMH 
-AGKEAQFLYLYALIYFLQCI 
QVRRRNFCTSMP*YIFYNAS 
25621 - CAACGCATGTAGAATTATTATGAGATGTTGGCTTTGTTGGAAGTGCAAATCCAAGAACCC - 25680 
-QRM*NYYEMI)ALLEVQIQEP 
-NACRIIMRCMLCWKCKSKNP 
THVELL* DVGFVGSANPRTH 
25681 - ATTACTTTATGATGCCAACTACTTTGTTTGCTGGCACACACATAACTATGACTACTGTAT - 25740
-ITL*CQLLCLLAHT*L*LLY 
LLY DANYFVCWHTHNY DY CI 
YFMMPTTLFAGTHITMTTVY 
25741 - ACCATATAACAGTGTCACAGATACAATTGTCGTTACTGAAGGTGACGGCATTTCAACACC - 25800 
-TI*QCHRYNCRY*R*RHFNT 
-PYNSVTDTIVVTEGDGISTP 
HITVSQIQIiSLLKVTAFQHQ 
25801 - AAAACTCAAAGAAGACTACCAAATTGGTGGTTATTCTGAGGATAGGCACTCAGGTGTTAA - 25860 
-KTQRRLPNWWLF*G*ALRC* 
-KLKEDYQIGGYSEDRHSGVK 
NSKKTTKL VVILRIGXQVLK 
25861 - AGACTATGTCGTTGTACATGGCTATTTCACCGAAGTTTACTACCAGCTTGAGTCTACACA - 25920 
-RLCRCTWLFHRSLLPA*VYT 

- DYVVVHGYFTEVYYQLESTQ 

TMSLYMAISPKFTTSLSLHK 
25921 - AATTACTACAGACACTGGTATTGAAAATGCTACATTCTTCATCTTTAACAAGCTTGTTAA - 25980
-NYYRHWY*KCYI LHL* Q A C * 

- ITTDTGIENATFFIFNKLVK 

LLQTLVLKMLHSSSLTSLLK 
25981 - AGACCCACCGAATGTGCAAATACACACAATCGACGGCTCTTCAGGAGTTGCTAATCCAGC - 26040 
-RPTECANTHNRRLFRSC*SS 
-DPPNVQIHTIDGSSGVANPA 

THRMCKYTQSTALQELLIQQ 
26041 - AATGGATCCAATTTATGATGAGCCGACGACGACTACTAGCGTGCCTTTGTAAGCACAAGA - 26100
-NGSNL* *ADDDY*RAFVSTR 

- MDPIYDEPTTTTSV?L*AQE 

WIQFMMSRRRLLACLCKHKK 
26101 - AAGTGAGTACGAACTTATGTACTCATTCGTTTCGGAAGAAACAGGTACGTTAATAGTTAA - 26160
-K*VRTYVLIRFGRNRYVNS* 
-SEYELMYSFVSEETGTLIVN 
VSTNLCTHSFRKKQVR* * L I 
26161 - TAGCGTACTTCTTTTTCTTGCTTTCGTGGTATTCTTGCTAGTCACACTAGCCATCCTTAC - 26220
-*RTSFSCFRGILASHTSHPY 
-SVLLFLAFVVFLLVTLAILT 

- AYFFFLLSWYSC*SH* P S L L 

26221 - TGCGCTTCGATTGTGTGCGTACTGCTGCAATATTGTTAACGTGAGTTTAGTAAAACCAAC - 26280
-CASIVCVI>LQYC*REFSKTN 
-ALRLCAYCCNIVNVSLVKPT 
RFDCVRTAAILLT*V* *NQR 
26281 - GGTTTACGTCTACTCGCGTGTTAAAAATCTGAACTCTTCTGAAGGAGTTCCTGATCTTCT - 26340
-GLRLLAC*KSELF*RSS*SS 
-VYVYSRVKNLNSSEGVPDLL 
FTSTRVLKI*TLLKEFLIFW 
26341 - GGTCTAAACGAACTAACTATTATTATTATTCTGTTTGGAACTTTAACATTGCTTATCATG - 26400 
-G LN ELT I 1 1 I LFGT LTLLIM 
-V*TN*LLLLFCLEL*HCIiSW 
SKRTNYYYYSVWNFNIAYHG 
26401 - GCAGACAACGGTACTATTACCGTTGAGGAGCTTAAACAACTCCTGGAACAATGGAACCTA - 26460 
-ADNGT ITVEELKQLLEQWNL 
-QTTVLLPLR3LNNSWNNGT* 
RQRYYYR*GA*TTPGTMEPS 
26461 - GTAATAGGTTTCCTATTCCTAGCCTGGATTATGTTACTACAATTTGCCTATTCTAATCGG - 26520
-VIGFLFLAWIMLLQFAYSNR 
-**VSYS*PGLCYYNLPILIG 

- NRFPIPSLDYVTTICLF*SE 

26521 - AACAGGTTTTTGTACATAATAAAGCTTGTTTTCCTCTGGCTCTTGTGGCCAGTAACACTT - 26580 
-NRFLY I IKLVFLWLLW PVTL 
-TGFCT* *SLFSSGSCGQ*HL 

- QVFVHNKACFPLALVASNTC 

26581 - GCTTGTTTTGTGCTTGCTGTTGTCTACAGAATTAATTGGGTGACTGGCGGGATTGCGATT - 26640
-ACFVLAVVYRINWVTGGIAI 
-LVLCLLLSTELIG*LAGLRL 
LFCACCCLQN*LGDWRDCDC 
26641 - GCAATGGCTTGTATTGTAGGCTTGATGTGGCTTAGCTACTTCGTTGCTTCCTTCAGGCTG - 26700 
-AMACIVGLMWLSYFVASFRL 
-QWLVL*A*CGLATSLLPSGC 
NGLYCRLDVA*LLRCFLQAV 
26701 - TTTGCTCGTACCCGCTCAATGTGGTCATTCAACCCAGAAACAAACATTCTTCTCAATGTG - 26760 
-FARTRSMWSFNPKTNILLNV 
-LLVPAQCGHSTQKQTFFSMC 
CSYPLNVVIQPRNKHSSQCA 
26761 - CCTCTCCGGGGGACAATTGTGACCAGACCGCTCATGGAAAGTGAACTTGTCATTGGTGCT - 26820 
-PLRGT IVTRPLMESELVIGA 
-LSGGQL* PDRS WKVNLSLVL 
SPGDNCDQTAHGK*TCHWCC 
26821 - GTGATCATTCGTGGTCACTTGCGAATGGCCGGACACTCCCTAGGGCGCTGTGACATTAAG - 26880
-VIIRGHLRMAGHSLGRCDIK 
-*SFVVTCEWPDTP*GAVTLR 
26881 - GACCTGCCAAAAGAGATCACTGTGGCTACATCACGAACGCTTTCTTATTACAAATTAGGA - 26940
-DLPKE I TVATSRTLS Y Y K L G 
-TCQKR3LWLHHERFLITN*E 
PAKRDHCGYITNAFLLQIRS 
26941 - GCGTCGCAGCGTGTAGGCACTGATTCAGGTTTTGCTGCATACAACCGCTACCGTATTGGA - 27000 
-ASQRVGTDSGFAAYNRYRI G 
-RRSV'-'ALIQVLLHTTATVLE 
*H*FRFCCIQP 
rACAGACCACGCCGGTAGCAACGACAATATTC 
-NYKLNTDHAGSNDNIALLVQ 
-TIN* IQTTPVATTILLC*YS 
L* IKYRPRR*QRQYCFASTV 
27061 - TAAGTGACAACAGATGTTTCATCTTGTTGACTTCCAGGTTACAATAGCAGAGATATTGAT - 27120 
-*VTTDVSSC*LPGYNSRDI D 
-K*QQMFHLVDFQVTIAEILI 
SDNRCFI1 ) LTSRLQ*QRY*L 
27121 - TATCATTATGAGGACTTTCAGGATTGCTATTTGGAATCTTGACGTTATAATAAGTTCAAT - 27180 
-YHYEDFQDCYLES*RYNKFN 
-IIMRTFRIAIWNLDVIISSI 
SL*GLSGLLFGILTL* * V Q * 
27181 - AGTGAGACAATTATTTAAGCCTCTAACTAAGAAGAATTATTCGGAGTTAGATGATGAAGA - 27240 
-SETI I *ASN*EELFGVR* *R 
-VRQLFKPLTKKNYSELDDEE 
*DNYLSL*LRRIIRS*MMKN 
27241 - ACCTATGGAGTTAGATTATCCATAAAACGAACATGAAAATTATTCTCTTCCTGACATTGA - 27300 
-TYGVRLS I KRT*KLFSS * H* 
-PMELDYP*NEHENYSLPDID 
L W S * I IHKTNMKIILFLTLI 
27301 - TTGTATTTACATCTTGCGAGCTATATCACTATCAGGAGTGTGTTAGAGGTACGACTGTAC - 27360 
-LYLHLASYITIRSVLEVRLY 
-CIYILRAISLSGVC*RYDCT 
VFTSCELYHYQECVRGTTVL 
27361 - TACTAAAAGAACCTTGCCCATCAGGAACATACGAGGGCAATTCACCATTTCACCCTCTTG - 27420 
-Y*KNLAHQEHTRAIHHFTLL 
-TKRTLPIRNIRGQFIISPSC 
LKEPCPSGTYEGNSPFHPIjA 
27421 - CTGACAATAAATTTGCACTAACTTGCACTAGCACACACTTTGCTTTTGCrTGTGCTGACG - 27480 

- L T INLH * LALAHTLLLLVLT 

- * Q * ICTNLH*HTLCFCLC*R 

DNKFALTCTSTHFAFACADG 
27481 - GTACTCGACATACCTATCAGCTGCGTGCAAGATCAGTTTCACCAAAACTTTTCATCAGAC - 27540 
-VLDIPISCVQDQFHQNFSSD 
-YSTYLSAACKISFTKTFHQT 
TRHTYQLRARSVSPKLFIRQ 
27541 - AAGAGGAGGTTCAACAAGAGCTCTACTCGCCACTTTTTCTCATTGTTGCTGCTCTAGTAT - 27600
-KRRFNKSSTRHFFSLLLL*Y 
-RGGSTRALLATFSHCCCSSI 
EEVQQELYSPLFLIVAALVF 
27601 - TTTTAATACTTTGCTTCACCATTAAGAGAAAGACAGAATGAATGAGCTCACTTTAATTGA - 27660
-F*YFASPLRERQNE*AHFN* 
-FNTLLHH*EKDRMNEIiTLID 
LILCFTIKRKTE*MSSL*LT 
27661 - CTTCTATTTGTGCTTTTTAGCCTTTCTGCTATTCCTTGTTTTAATAATGCTTATTATATT - 27720
-LLFVLFSLSAIPCFNNAYYI 

- FYLCFLAFLLFLVLIMLIIF 

SICAF*PFCYSLF**CLLYF 
27721 - TTGGTTTTCACTCGAAATCCAGGATCTAGAAGAACCTTGTACCAAAGTCTAAACGAACAT - 27780
-LVFTRNPGSRRTLYQSLNEH 
-WFSLEIQDLEEPCTKV i TNM 
GFHSKSRI*KNLVPKSKRT* 
27781 - GAAACTTCTCATTGTTCTGACTTGTATTTCTCTATGCAGTTGCATATGCACTGTAGTACA - 27840
-ETSHCFDLYFSMQLHMHCST 
-KLLIVLTCI3LCSCICTVVQ 
NFSIiF*I)VFLYAVAYAL*YS 
27841 - GCGCTGTGCATCTAATAAACCTCATGTGCTTGAAGATCCTTGTAAGGTACAACACTAGGG - 27900 
-ALCI*"TSCA*RSL*GTTLG 
-RCASNKPHVLEDPCKVQH*G 
AVHLINLMCLKILVRYNTRG 
27901 - GTAATACTTATAGCACTGCTTGGCTTTGTGCTCTAGGAAAGGTTTTACCTTTTCATAGAT - 27960
-VI LIALLGFVL* E R F Y L F I D 
- *YL* HCLALCSRKGFTFS*M 
NTYSTAWLCALGKVLPFHRW 
27961 - GGCACACTATGGTTCAAACATGCACACCTAATGTTACTATCAACTGTCAAGATCCAGCTG - 28020 
-GTLWFKHAHLMLLSTVKIQIi 
-AHYGSNMHT*CYYQLSRSSW 
HTMVQTCTPNVTINCQDPAG 
28021 - GTGGTGCGCTTATAGCTAGGTGTTGGTACCTTCATGAAGGTCACCAAACTGCTGCATTTA - 28080 
-VVRL*LGVGTFMKVTKLLHL 
-WCAYS*VLVPS*RSPNCCI* 
GALIARCWYLHEGHQTAAFR 
28081 - GAGACGTACTTGTTGTTTTAAATAAACGAACAAATTAAAATGTCTGATAATGGACCCCAA - 28140 
-ETYLLF*INEQIKMSDUGPQ 
-RRTCCFK* TNKLKCLIMDPN 
DVLVVLNKRTN*NV**WTPI 
28141 - TCAAACCAACGTAGTGCCCCCCGCATTACATTTGGTGGACCCACAGATTCAACTGACAAT - 28200 
-SNQRSAPRITFGGPTDSTDN 
-QTNVVPPALHLVDPQIQLTI 
KPT*CPPHYIWWTHRFN*Q* 
28201 - AACCAGAATGGAGGACGCAATGGGGCAAGGCCAAAACAGCGCCGACCCCAAGGTTTACCC - 28260 
-NQNGGRNGARPKQRRPQGLP 
-TRMEDAMGQGQNSADPKVYP 
PEWRTQWGKAKTAPTPRFTQ 
28261 - AATAATACTGCGTCTTGGTTCACAGCTCTCACTCAGCATGGCAAGGAGGAACTTAGATTC - 28320 
-NNTASWFTALTQHGKEELRF 
I ILRLGSQLSLSMARRNLDS 
*YCVLVHSSHSAWQGGT*IP 
28321 - CCTCGAGGCCAGGGCGTTCCAATCAACACCAATAGTGGTCCAGATGACCAAATTGGCTAC - 28380 
-PRGQGVPINTNSGPDDQIGY 
-LEARAFQSTPIVVQMTKLAT 
SRPGRSNQHQ' , 'WSR*PNWLL 
28381 - TACCGAAGAGCTACCCGACGAGTTCGTGGTGGTGACGGCAAAATGAAAGAGCTCAGCCCC - 28440 
-YRRATRRVRGGDGKMKELSP 
-TEELPDEFVVVTAK*KSSAP 
PKSYPTSSWW*RQNERAQPQ 
28441 - AGATGGTACTTCTATTACCTAGGAACTGGCCCAGAAGCTTCACTTCCCTACGGCGCTAAC - 28500 
-RWYFYYLGTGPEASLPYGAN 
-DGTSIT*ELAQKLHFPTALT 
MVLLLPRNWPRSFTSLRR*Q 
28501 - AAAGAAGGCATCGTATGGGTTGCAACTGAGGGAGCCTTGAATACACCCAAAGACCACATT - 28560 
-KEGIVWVATEGALNTPKDHI 
-KKASYGLQLREP* IHPKTTL 
RRHRMGCN*GSLEYTQRPHW 
28561 - GGCACCCGCAATCCTAATAACAATGCTGCCACCGTGCTACAACTTCCTCAAGGAACAACA - 28620
-GTRNPNNNAATVLQLPQGTT 
-APAILITMJjPPCYNFLKEQH 
HPQS*-*QCCHRATT33RNNI 
28621 - TTGCCAAAAGGCTTCTACGCAGAGGGAAGCAGAGGCGGCAGTCAAGCCTCTTCTCGCTCC - 28680
-LPK 3F YAEGSRGGSQASSRS 
CQ K A S T QREAEAAV KP LLAP 
AKRLLRRGKQRRQSSLFSLL 
28681 - TCATCACGTAGTCGCGGTAATTCAAGAAATTCAACTCCTGGCAGCAGTAGGGGAAATTCT - 28740 
-SSRSRGNSRNSTPGSSRGNS 
-HHVVAVIQEIQLLAAVGEIL 
IT* SR*FKKFNSWQQ*GKFS 
28741 - CCTGCTCGAATGGCTAGCGGAGGTGGTGAAACTGCCCTCGCGCTATTGCTGCTAGACAGA - 28800 
-PARMASGGGETALALLLLDR 
-LLEWLAEVVKLPSRYCC*TD 
CSNG*RRW*NCPRAIAARQI 
28801 - TTGAACCAGCTTGAGAGCAAAGTTTCTGGTAAAGGCCAACAACAACAAGGCCAAACTGTC - 28860 
-LNQLESKVSGKGQQQQGQTV 

- *TSLRAKFLVKANNNKAKLS 

EPA^EQSFW* RPTTTRPNCH 
28861 - ACTAAGAAATCTGCTGCTGAGGCATCTAAAAAGCCTCGCCAAAAACGTACTGCCACAAAA - 28920
-TKKSAAEASKKPRQKRTATK 
-LRNLLLRHLKSLAKNVLPQN 
*EICC*GI*KASPKTYCHKT 
28921 - CAGTACAACGTCACTCAAGCATTTGGGAGACGTGGTCCAGAACAAACCCAAGGAAATTTC - 28980
-QYNVTQAFGRRGPEQTQGNF 
-STTSLKHLGDVVQNKPKEIS 
VQRHSSIWETWSRTNPRKFR 
28981 - GGGGACCAAGACCTAATCAGACAAGGAACTGATTACAAACATTGGCCGCAAATTGCACAA - 29040 
-GDQDtilRQGTDYKHWPQIAQ 
-GTKT*SDKELITNIGRKLHN 
GPRPNQTRN*LQTLAANCTI 
29041 - TTTGCTCCAAGTGCCTCTGCATTCTTTGGAATGTCACGCATTGGCATGGAAGTCACACCT - 29100
-FAPSASAFFGMSRIGMEVTP 
LLQVPLHSLECHALAWKSHL 
CSKCLCILWNVTHWHGSHTF 
29101 - TCGGGAACATGGCTGACTTATCATGGAGCCATTAAATTGGATGACAAAGATCCACAATTC - 29160 
-SGTWLTYHGAIKLDDKDPQF 
-REHG*LIMEPLNWMTKIHNS 
GNMADLSWSK*IG*QRSTIQ 
29161 - AAAGACAACGTCATACTGCTGAACAAGCACATTGACGCATACAAAACATTCCCACCAACA - 29220 
-KDNVILLNKHIDAYKTFPPT 
-KTTSYC*TSTLTHTKHSHQQ 
RQRHTAEQAF. *RIQNIPTNR 
29221 - GAGCCTAAAAAGGACAAAAAGAAAAAGACTGATGAAGCTCAGCCTTTGCCGCAGAGACAA - 29280
-E PKKDKKKKTDEAQPLPQRQ 
SLKRTKRKRLMKLSLCRRDK 
A i KGQKEKD**SSAFAAETK 
29281 - AAGAAGCAGCCCACTGTGACTCTTCTTCCTGCGGCTGACATGGATGATTTCTCCAGACAA - 29340 
-KKQPTVTLLPAADMDDFSRQ 
-RSSPL*LFFLRLTWMISPDN 
EAAHCDSSSCG*HG*FLQTT 
29341 - CTTCAAAATTCCATGAGTGGAGCTTCTGCTGATTCAACTCAGGCATAAACACTCATGATG - 29400 
-LQNSMSGASADSTQA*TIiMM 

- FKIP*VELLLIQLRHKHS** 

SKFHEWSFC*FNSGINTHDD 
29401 - ACCACACAAGGCAGATGGGCTATGTAAACGTTTTCGCAATTCCGTTTACGATACATAGTC - 29460
-TTQGRWAM*TFSQFRLRYIV 
-PHKADGLCKRFRNSVYDT*S 
HTRQMGYVNVFAIPFTIHSL 



-TLVQNEFS*LNSTSRFS*L* 
LLCRMNSRN i TAQVGLVNFN 
29521 - ATCTCACATAGCAATCTTTAATCAATGTGTAACATTAGGGAGGACTTGAAAGAGCCACCA - 29580
-ISHSNL*SMCNIREDLKEPP 
-SHIAIFNQCVTLGRT*KSHH 
LT*QSLINV*H*GGLERATT 
29581 - CATTTTCATCGAGGCCACGCGGAGTACGATCGAGGGTACAGTGAATAATGCTAGGGAGAG - 29640 
-HFHRGHAEYDRGYSE* C*GE 
-IFIEATRSTIEGTVNNARES 
FSSRPRGVRSRVQ* IMLGRA 
29641 - CTGCCTATATGGAAGAGCCCTAATGTGTAAAATTAATTTTAGTAGTGCTATCCCCATGTG - 29700 
-LPIWKSPNV*N*F* *CYPHV 
-CLYGRAliMCKINFSSAIPM* 
AYMEEP*CVKLILVVLSPCD 
29701 - ATTTTAATAGCTTCTTAGGAGAATGACAAAAAAAAAAAAAAA - 29742 
-ILIAS *ENDKKKKKX 
-F**LLRRMTKKKKX 
FNSFLGE*QKKKKX 
1 - TTTTTTTTTTTTTTTGTCATTCTCCTAAGAAGCTATTAAAATCACATGGGGATAGCACTA - 60
-FFFFFVILLRSY*NHMGIAL 
-FFFFLSFS*EAIKITWG*HY 
FFFFCHSPKKLIiKSIlG D 3TT 
61 - CTAAAATTAATTTTACACATTAGGGCTCTTCCATATAGGCAGCTCTCCCTAGCATTATTC - 120
-LKLILHIRALPYRQLSLALF 
-*N*FYTLGLFHIGSSP*HYS 
KINFTH*GSSI*AALPSIIH 
121 - ACTGTACCCTCGATCGTACTCCGCGTGGCCTCGATGAAAATGTGGTGGCTCTTTCAAGTC - 180
-TVPSI VLRVASMKMWWLFQV 
-LYPRSYSAWPR*KCGGSFKS 
CTLDRTPRGLDENVVALSSP 
181 - CTCCCTAATGTTACACATTGATTAAAGATTGCTATGTGAGATTAAAGTTAACTAAACCTA - 240 
-LPNVT H * L K I A M * D * S * LNL 
-SLMLHID*RLLCEIKVN*TY 
P*CYTLIKDCYVRLKLTKPT 
241 - CTTGTGCTGTTTAGTTACGAGAATTCATTCTGCACAAGAGTAGACTATGTATCGTAAACG - 300 
-LVLFSYENSFCTRVDYVS*T 
-LCCLVTRIHSAQE*TMYRKR 
CAV*LREFILHKSRLCIVNG 
301 - GAATTGCGAAAACGTTTACATAGCCCATCTGCCTTGTGTGGTCATCATGAGTGTTTATGC - 360 
-ELRKRLHS PSALCGHHECLC 
-NCENVYIAHLPCVVIMSVYA 
IAKTFT*PICLVWSS*VFMP 
361 - CTGAGTTGAATCAGCAGAAGCTCCACTCATGGAATTTTGAAGTTGTCTGGAGAAATCATC - 420
-LS* ISRSSTHGILKLSGEII 
-*VESAEAPLMEF*SCLEKSS 
ELNQQKLHSWNFEVVWRNHP 
421 - CATGTCAGCCGCAGGAAGAAGAGTCACAGTGGGCTGCTTCTTTTGTCTCTGCGGCAAAGG - 480
-HVSRRKKSHSGLLLLSLRQR 
-MSAAGRRVTVGCFFCLCGKG 
CQPQEEESQWAASFVSAAKA 
481 - CTGAGCTTCATCAGTCTTTTTCTTTTTGTCCTTTTTAGGCTCTGTTGGTGGGAATGTTTT - 540 
-LSFISLFLFVLFRLCWWECF 
-*ASSVFFFLSFLGSVGGNVL 
ELHQSFSFCPF*ALLVGMFC 
541 - GTATGCGTCAATGTGCTTGTTCAGCAGTATGACGTTGTCTTTGAATTGTGGATCTTTGTC - 600 
-VCVNVLVQQYDVVFELWIFV 
-YASMCLFSSMTLSLNCGSLS 
MRQCACSAV*RCL*IVDLCH 
601 - ATCCAATTTAATGGCTCCATGATAAGTCAGCCATGTTCCCGAAGGTGTGACTTCCATGCC - 660 
-IQFNGSMI SQPCSRRCDFHA 
-SNLMAP**VSHVPEGVTSMP 
PI*WLHDKSAMFPKV*LPCQ 
661 - AATGCGTGACATTCCAAAGAATGCAGAGGGACTTGGAGCAAATTGTGCAATTTGCGGCCA - 720 
-NA*HSKECRGTWSKLCNLRP 
-MRDIPKNAEALGANCAICGQ 
CVTFQRMQRHLEQIVQFAAN 
721 - ATGTTTGTAATCAGTTCCTTGTCTGATTAGGTCTTGGTCCCCGAAATTTCCTTGGGTTTG - 780
-MFVISSLSD*VLVPEISLGL 
-CL*SVPCLIRSWSPKFPWVC 
VCNQFLV* LGLGPRNFLGFV 
781 - TTCTGGACCACGTCTCCCAAATGCTTGAGTGACGTTGTACTGTTTTGTGGCAGTACGTTT - 840 
-FWTTSPKCLSDVVLFCGSTF 
-SGPRLPNA*VTLYCFVAVRF 
LDHVSQMLE*RCTVLWQYVF 
841 - TTGGCGAGGCTTTTTAGATGCCTCAGCAGCAGATTTCTTAGTGACAGTTTGGCCTTGTTG - 900
-LAKLFRCL 35RFLS DSLALL 
-WRGFLDASAADFLVTVWPCC 
GEAF*MPQQQIS**QFGLVV 
901 - TTGTTGGCCTTTACCAGAAACTTTGCTCTCAAGCTGGTTCAATCTGTCTAGCAGCAATAG - 960
-LLAFTRNFALKLVQSV*QQ* 
-CWPLPETLL3SWFKL S S3NS 
VGLYQKLCSQAGSICLAAIA 
961 - CGCGAGGGCAGTTTCACCACCTCCGCTAGCCATTCGAGCAGGAGAATTTCCCCTACTGCT - 1020
-REGSFTTSASHSSRRISPTA 
-ARAVSPPPLAIRAGEFPLLL 
RGQFHHLR* PFEQENFPYCC 
1021 - GCCAGGAGTTGAATTTCTTGAATTACCGCGACTACGTGATGAGGAGCGAGAAGAGGCTTG - 1080 
-ARS*IS*ITATT**GARRGL 

- PGVEFLELPRLRDEEREEA* 

QELNFLNYRDYVMRSEKRLD 
1081 - ACTGCCGCCTCTGCTTCCCTCTGCGTAGAAGCCTTTTGGCAATGTTGTTCCTTGAGGAAG - 1140 
-TAASASLCVEAFWQCCSLRK 
-LPPLLPSA*KPFGNVVP*GS 
CRLCFPLRRSLLAMLFLEEV 
1141 - TTGTAGCACGGTGGCAGCATTGTTATTAGGATTGCGGGTGCCAATGTGGTCTTTGGGTGT - 1200
-L*HGGSIVIRIAGANVVFGC 
-CSTVAALLLGLRVPMWSLGV 
VARWQHCY*DCGCQCGLWVY 
1201 - ATTCAAGGCTCCCTCAGTTGCAACCCATACGATGCCTTCTTTGTTAGCGCCGTAGGGAAG - 1260 
-IQGSLSCN PY DAFFVSAVGK 

- FKAPSVATHTMPSLLAP*GS 

SRLPQLQPIRCLLC*RRREV 
1261 - TGAAGCTTCTGGGCCAGTTCCTAGGTAATAGAAGTACCATCTGGGGCTGAGCTCTTTCAT - 1320 
-*SFWASS*VIEVPSGAELFH 
-EASGPVPR**KYHLGLSSFI 
KLLGQFLGNRSTIWG*ALSF 
1321 - TTTGCCGTCACCACCACGAACTCGTCGGGTAGCTCTTCGGTAGTAGCCAATTTGGTCATC - 1380
-FAVTTTNSSGSSSVVANLVI 
-LPSPPRTRRVALR + ^PIWSS 
CRHHHE LVG*LFGSSQFGHL 
1381 - TGGACCACTATTGGTGTTGATTGGAACGCCCTGGCCTCGAGGGAATCTAAGTTCCTCCTT - 1440 
-WTTIGVDWNALASRESKFLL 
-GPLLVIjIGTPWPRGNLSSSL 
DHYWC*LERPGLEGI*VPPC 
1441 - GCCATGCTGAGTGAGAGCTGTGAACCAAGACGCAGTATTATTGGGTAAACCTTGGGGTCG - 1500 
-AMLSESCE PRRSI I G * T L G S 

- pc*VRAVNQDAVLLGKPWGR 

HAE*EL*TK TQYYWVNLGVG 
1501 - GCGCTGTTTTGGCCTTGCCCCATTGCGTCCTCCATTCTGGTTATTGTCAGTTGAATCTGT - 1560 
-ALFWPCPIASSILVIVS*IC 
-RCFGLAPLRPPFWLLSVESV 
AVLALEHCVLHSGYCQLNLW 
1561 - GGGTCCACCAAATGTAATGCGGGGGGCACTACGTTGGTTTGATTGGGGTCCATTATCAGA - 1620 
-GSTKCNAGGTTLV J 'LGSIIR 
-GPPNVMRGALRWFDWGPLSD 
VHQM*CGGHYVGIiIGVHYQT 
1621 - CATTTTAATTTGTTCGTTTATTTAAAACAACAAGTACGTCTCTAAATGCAGCAGTTTGGT - 1680 
-HFNLFVYLKQQVRL*MQQFG 

- ILICSFI*NNKYVSKCSSLV 

F * FVRLFKTTSTSLNAAVW* 
1681 - GACCTTCATGAAGGTACCAACACCTAGCTATAAGCGCACCACCAGCTGGATCTTGACAGT - 1740
-DLHEGTNT*L i AHHQLDLDS 
-TFMKVPTPSYKRTTSWILTV 
PS*RYQHI ) AISAPPAGS*QL 
1741 - TGATAGTAACATTAGGTGTGCATGTTTGAACCATAGTGTGCCATCTATGAAAAGGTAAAA - 1800 
~* * * H ■ V C M F E P * C A I Y E K V K 
-DSNIRCACLNHSVPSMKR*N 
I V T L G V H V * T I V C H L * K G K T 
1801 - CCTTTCCTAGAGCACAAAGCCAAGCAGTGCTATAAGTATTACCCCTAGTGTTGTACCTTA - 1860
-PFLEHKAKQCYKYYP*CCTL 
-LS*STKPSSAISITPSVVPY 
FPRAQSQAVL*VLPIiVIiYLT 
1861 - CAAGGATCTTCAAGCACATGAGGTTTATTAGATGCACAGCGCTGTACTACAGTGCATATG - 1920
-QGSSST*GLLDAQRCTTVHM 
-KDLQAHEVY*MHSAVLQCIC 
RIFKHMRFIRCTALYYSAYA 
1921 - CAACTGCATAGAGAAATACAAGTCAAAACAATGAGAAGTTTCATGTTCGTTTAGACTTTG - 1980 
-QLHREIQVKTMRSFMFV*TL 
-NCIEKYKSKQ*EV3CSFRLW 
TA*RNTSQNNEKFHVRLDFG 
1981 - GTACAAGGTTCTTCTAGATCCTGGATTTCGAGTGAAAACCAAAATATAATAAGCATTATT - 2040 
-VQGSSRSWISSENQNIISII 
-YKVLLDPGE'RVKTKI * *ALL 
TRFF*ILDFE*KPKYNKHY* 
2041 - AAAACAAGGAATAGCAGAAAGGCTAAAAAGCACAAATAGAAGTCAATTAAAGTGAGCTCA - 2100
-KT RN SRKAKKHK* KS IKVS S 
-KQGIAERLKSTNRSQLK*AH 
NKE*QKG*KAQIEVN*SELI 
2101 - TTCATTCTGTCTTTCTCTTAATGGTGAAGCAAAGTATTAAAAATACTAGAGCAGCAACAA - 2160
-FILSFS*W*SKVLK1LEQQQ 
-SFCLSLNGEAKY*KY*SSNN 
HSVFLLMVKQSIKNTRAATM 
2161 - TGAGAAAAAGTGGCGAGTAGAGCTCTTGTTGAACCTCCTCTTGTCTGATGAAAAGTTTTG - 2220 
-*EKVASRALVEPPLV**KVL 
-EKKHRVELLLNLLLSDEKFW 
RKSGE*SSC*TSSCLMKSFG 
2221 - GTGAAACTGATCTTGCACGCAGCTGATAGGTATGTCGAGTACCGTCAGCACAAGCAAAAG - 2280
-VKL I LHAADRYVEYRQHKQK 
-*N*SCTQLIGMSSTVSTSKS 
ETDLARS**VCRVPSAQAKA 
2281 - CAAAGTGTGTGCTAGTGCAAGTTAGTGCAAATTTATTGTCAGCAAGAGGGTGAAATGGTG - 2340 
-QSVC*CKLVQIYCQQEGEMV 
-KVCASAS*CKFIVSKRVKW* 
KCVLVQVSANLLSARG*NGE 
2341 - AATTGCCCTCGTATGTTCCTGATGGGCAAGGTTCTTTTAGTAGTACAGTCGTACCTCTAA - 2400
-NCPRMFLMGKVLLVVQSYL* 
IALVCS^WARFF* * YSRTSN 
LPSYVPDGQGSFSSTVVPLT 
2401 - CACACTCCTGATAGTGATATAGCTCGCAAGATGTAAATACAATCAATGTCAGGAAGAGAA - 2460 
-HTPDSDIARKM* IQSMSGRE 
-TLLIVI*LARCKYNQCQEEN 
HS***YSSQDVNTINVRKRI 
2461 - TAATTTTCATGTTCGTTTTATGGATAATCTAACTCCATAGGTTCTTCATCATCTAACTCC - 2520
- * FSCSFYG*SNSIGSSSSNS 
-NFHVRFMDNLTP*VLHHLTP 
IFlMFVLWII*LHRFFII*LR 
2521 - GAATAATTCTTCTTAGTTAGAGGCTTAAATAATTGTCTCACTATTGAACTTATTATAACG - 2580
- E * FFLVRGLNMCLTIELI IT 
-NNSS *lEA*IIVSIiLNLI 1 *R 
I I L L S R L K * L S H Y - T Y Y N V 

2581 - TCAAGATTCCAAATAGCAATCCTGAAAGTCCTCATAATGATAATCAATATCTCTGCTATT - 2640
-SRFQIAILKVLIMIINISAI 



-VTWKSTR*NICCHLLY*QSN 
-* PGSQQDETSVVTYCTSKAI 
NLEVNKMKHLL5LTVLAKQY 

2701 - ATTGTCGTTGCTACCGGCGTGGTCTGTATTTAATTTATAGTTTCCAATACGGTAGCGGTT - 2760 
-IVVATGVVCI*FIVSNTVAV 
-LSLLPAWSVFNL*FPIR*RL 
CRCYRRGLYLIYSFQYGSGC 

2761 - GTATGCAGCAAAACCTGAATCAGTGCCTACACGCTGCGACGCTCCTAATTTGTAATAAGA - 2820 
-VCSKT * I SAYTLRRS* F V I R 
-YAAKPESVPTRCDAPNL* * E 



-KRS*CSHSDLFWQVLNVTAP 
-SVRDVATVISFGRSLMSQRP 
AFVM* PQ*SLLAGP* CHSAL 
2881 - TAGGGAGTGTCCGGCCATTCGCAAGTGACCACGAATGATCACAGCACCAATGACAAGTTC - 2940 
-*GVSGHSQVTTNDHSTNDKF 
-RECPAIRK+PRMITAPMTSS 
GSVRPFASDHE* SQHQ^QVH 
2941 - ACTTTCCATGAGCGGTCTGGTCACAATTGTCCCCCGGAGAGGCACATTGAGAAGAATGTT - 3000 
-TFHERSGHNCPPERHIEKNV 
-LSMSGLVTIVPRRGTLRRMF 
FP*AVWSQLS PGEAH* EECL 



-VSGLNDHIERVRANSLKEAT 
FLG*MTTLSGYEQTA*RKQR 
3061 - GAAGTAGCTAAGCCACATCAAGCCTACAATACAAGCCATTGCAATCGCAATCCCGCCAGT - 3120
-EVAKP HQAYNTSHCNRNPAS 
-K*LSHIKPTIQAIAIAIPPV 
5S*ATSSLQYKPLQSQSRQS 
3121 - CACCCAATTAATTCTGTAGACAACAGCAAGCACAAAACAAGCAAGTGTTACTGGCCACAA - 3180
-HP INSVDNSKHKTSKCYWPQ 
-TQLIL*TTASTKQASVTGHK 



•EPEENKLYYVQKPVP 
■SQRKTSFIMYKNLFI 
ARGKQALLCTKTCS 



Q A R N R K 
R L G I G N 



SCLSSSTVIVPLSAMISNVK 
VV*APQR i '*YRCLP**AMLK 
3361 - AGTTCCAAACAGAATAATAATAATAGTTAGTTCGTTTAGACCAGAAGATCAGGAACTCCT - 3420
-SSKQNNNNS*FV*TRRSGTP 
-VPNRIIIIVSSFRPEDQELL 
FQTE****LVRLDQKIRNSF 
3421 - TCAGAAGAGTTCAGATTTTTAACACGCGAGTAGACGTAAACCGTTGGTTTTACTAAACTC - 3480 
-SEEFRFLTRE*T*TVGFTKL 
-QKSS DF'HASRRKPLVLLKS 
RRVQIFNTRVDVNRWFY*TH 
3481 - ACGTTAACAATATTGCAGCAGTACGCACACAATCGAAGCGCAGTAAGGATGGCTAGTGTG - 3540 
-TLTILQQYAHNRSAVRMASV 
-R*QYCSSTHTIEAQ*GWLV* 
VNNIAAVRTQSKRSKDG*CD 
3541 - ACTAGCAAGAATACCACCAAAGCAAGAAAAAGAAGTACGCTATTAACTATTAACGTACCT - 3600 
-TSKNTTKARKRSTLLTINVP 
-LARI PRKQEKEVRY*LLTYL 
*QEYHESKKKKYAIWY*RTC 
3601 - GTTTCTTCCGAAACGAATGAGTACATAAGTTCGTACTCACTTTCTTGTGCTTACAAAGGC - 3660
-VSSETNEYISSYSLSCAYKG 

- FLPKRMST^VRTHFLVLTKA 

FFRNE*VHKFVLTFliCLQRH 
3661 - ACGCTAGTAGTCGTCGTCGGCTCATCATAAATTGGATCCATTGCTGGATTAGCAACTCCT - 3720
-TLVVVVGSS*IGSIAGLATP 

- R**SSSAHHKLDPLLD*QLL 

ASSRRRLIINWIHCWISNS* 
3721 - GAAGAGCCGTCGATTGTGTGTATTTGCACATTCGGTGGGTCTTTAACAAGCTTGTTAAAG - 3780 
-EEPS IVC I CTFGGSLT SLLK 
-KSRRLCVFAHSVGL*QAC*R 
RAVDCVYLHIRWVFNK1VKD 
3781 - ATGAAGAATGTAGCATTTTCAATACCAGTGTCTGTAGTAATTTGTGTAGACTCAAGCTGG - 3840 
-MKNVAFS I PVSVVICV DSSW 
-*RM*HFQYQCL**FV*TQAG 
EECS IFNTSVCSNLCRLKLV 
3841 - TAGTAAACTTCGGTGAAATAGCCATGTACAACGACATAGTCTTTAACACCTGAGTGCCTA - 3900
-**TSVK*PCTTT*SLTPECL 
-SKLR*NSHVQRHSL*HLSAY 
VNFGEIAMYNDIVFWT*VPI 

3901 - TCCTCAGAATAACCACCAATTTGGTAGTCTTCTTTGAGTTTTGGTGTTGAAATGCCGTCA - 3960

-SSE*PPIW*SSLSFGVEMPS 
-PQNNHQFGSLL*VLVLKCRH 

LRITTNLVVFFEFWC*NAVT 
3961 - CCTTCAGTAACGACAATTGTATCTGTGACACTGTTATATGGTATACAGTAGTCATAGTTA - 4020 
-PSVTTIVSVTLLYGIQ*S*L 
-LQ*RQLYL *HCYMVYSSHSY 

FSNDNCICDTVIWYTVVIVM 

4021 - TGTGTGTGCCAGCAAACAAAGTAGTTGGCATCATAAAGTAATGGGTTCTTGGATTTGCAC - 4080

-CVCQQTK* LAS* SNGFLDLH 
-VCASKQSSWHHKVMGSWICT 

CVPANKVVGIIK*WVLGFAL 
4081 - TTCCAACAAAGCCAACATCTCATAATAATTCTACATGCGTTGATGCATTGTAGAAAATAT - 4140
-FQQSQHLIIILHALMHCRKY 
-SNKANIS* *FYMR*CIVENI 

PTKPTSHNNSTCVDAL*KIY 
4141 - ATCAAGGCATAGAGGTACAAAAATTGCGCCTCCTTACCTGCAGCGACAAGCAAAAGATGT - 4200 
-IKA*RYKNCA3LPAATSKRC 
-SRHRGTKIAPPYLQRQAKDV 

QGIEVQKLRLLTCSDKQKM* 
4201 - GAATAGATGGTAACAAATAGCAGCAGTAAATTGCAAATGAACTGGAAGCCCTTATAAAGG - 4260
-E *MVTNS S SKLQMNWKPL* R 
_NRW*QIAAVNCK*TGSPYKG 
IDGNK*QQ i 'IANELEALIKG 

4261 - GCTAGCTGCCATCTTTTATTGAGCGCAATTATTTTGGTAGCGCTCTGAAAAACAGCAAGA - 4320 
-ASCHLLLSAI ILVAL*KTAR 

- L A A I F Y * A Q L F W * R S E K Q G E 

*LPSFIERNYFGSALKNSKK 
4321 - AATGCAACGCCAATAACAAGCCATCCGAAAGGGAGTGAGGCTTGTAGCGGTATCGTTGCT - 4380 
-NATPI TSHPKGSEACSGIVA 
-MQRQ*QAIRKGVRIiVAVSLL 
CNANNKPSERE*GL*RYRCC 
4381 - GTAGCATGAACAGTACTTGCAGGAGAAGCATTGTCAATTTTTACTGGCTGTGCAGTAATT - 4440
-VA*TVLAGEALSIFTGCAVI 

- * HEQYI>QEKHCQFLLAVQ*L 

- SMN3TCRRSIVNFYWLCSN* 

4441 - GATCCAAGAGTAAAAAATCTCATAAACAAATCCATAAGTTCGTTTATGTGTAATGTAATT - 4500
-DPRVKNLINKSISSFMCNVI 
-IQE*KIS*TNP*VRLCVM*F 
SKSKKSHKQIHKFVYV*CNL 
4501 - TGACACCCTTGAGAACTGGGTCAGAGTCATCCTCATCAAACTTGCAGCAAGAACCACAAG - 4560 
-*HP- t ELAQSHPHQTCSKNHK 
-DTLENWLRVILIKLAARTTR 
TPLRTGSESSSSNLQQEPQE 
4561 - AGCATGCACCCTTGAGGCAACTGCAACAACTAGTCATGCAACAAAGCAAGATTGTAACCA - 4620
-SMHP*GNCNN k SCNKARL*P 
-ACTLEATATTSHATKQDCNH 
HAPLRQLQQIiVMQQSKIVTM 
4621 - TGACGATGGCAATTAGTCCAGCAATGAAGCCGAGCCAAACATACCAAGGCCATTTAATAT - 4680
-* RWQLVQQ*SRAKHTKAI * Y 
-DDGN*SSNEAEPNIPRPFNI 
TMAISPAMKPSQTYQGHLIY 
4681 - ATTGCTCATATTTTCCCAATTCTTGAAGGTCAATGAGTGATTCATTTAAATTTTTAGCGA - 4740
-IAHIFPILEGQ*VIHLNF*R 
-LLIFSQFLKVNE*FI* IFSD 
CSYFPN3*RSMSDSFKFLAT 
4741 - CCTCATTGAGGCGGTCAATTTCTTTTTGAATGTTGACGACAGAAGCGTTAATGCCTGAAA - 4800
-FH J -GGQFLFEC*RQKR*CLK 
-LIEAVNFF1NVDDRSVNA*N 
SLRRSISF*MLTTEALMPEM 
4801 - TGTCGCCAAGATCAACATCTGGTGATGTATGATTTTTGAAGTACTTGTCCAGCTCTTCTT - 4860
-CRQDQHIiVMYDF*STCPALL 
-VAKINIW*CMI FEVLVQLFF 
SPRSTSGDV + FLKYLSSSSL 
4861 - TGAATGAGTCAAGCTCAGGTTGCAGAGGATCATAAACTGTGTTGTTAATGATGCCAATAA - 4920
-♦MSQAQVAEDHKLCC* * C Q * 
-E *VKLRLQRIINCVVNDANN 
NESSSGCRGS*TVLLMMPIT 
4921 - CGACATCACAATTTCCTGAGACAAATGTATTGTCTGTAGTAATTATTTGTGGAGAAAAGA - 4980
-RHHNF.LRQMYCL * * L F V E K R 
-DITIS*DKCIVCSNYLWRKE 
TSQFPETNVLSVVI ICGEKK 
4981 - AGTTCCTCTGTGTAATAAACCAAGAAGTGCCATTAAACACAAAAACACCTTCACGAGGGA - 5040
-SSSV* *TKKCH*TQKHLHEG 
-VFLCNKPRSAIKHKNTFTRE 
FLCVINQEVPLNTKTPSRGK 
5041 - AGTATGCTTTGCCTTCATGACAAATTGCTGGCGCTGTGGTGAAGTTCCTCTCCTGGGATG - 5100
-SKLCLHDKLL ALW *S3SPGM 
-VCFAFMTKCWRCGEVPLLGW 
YALPS 'QIAGAVVKFLSWDG 

5101 - GCACATACGTGACATGTAGGAAGACAACACCATGCGGGGCTGCTTGTGGGAAGGACATAA - 5160

- A H T * HVGRQHHAGLLVGRT * 
-HIRDM*EDNTMRGCLWBGHK 

TYVTCRKTTPCGAACGKDIR 
5161 - GGTGGTAGCCCTTTCCACAAAAGTCAACTCTTTTTGATTGTCCAAGAACACACTCAGACA - 5220 
-GGSPFHKSQliFLIVQEHTQT 
-VVALSTKVNSF* LSKNTLRH 
W * PFPQKSTLFDCPRTHSDI 
5221 - TTTTAGTAGCAGCAAGATTAGCAGAAGCCCTGATTTCAGCAGCCCTGATTAGTTGTTGTG - 5280

- F * *QQD*QKP*FQQP*LVVV 
-FSSSKISRSPDFSSPD^LLC 

LVAARLAEALISAALI SCCV 
5281 - TTACATAGGTTTGAAGGCTTTGAAGTCTGCCTGTAATTAACCTGTCAATTTGTACCTCCG - 5340
-LHRFEGFEVCL* LTCQFVPP 
-YIGLKALKSACN*PVNLYLR 
T i V*RL*SLPVINLSICTSA 
5341 - CCTCGACTTTATCAAGTCGCGAAAGGATATCATTTAGCACACTTGAAATTGCACCAAAAT - 5400
-PRLYQVAKGYHLAHLK1, HQN 
-LDFIKSRKDII*HT*NCTKI 
STLiSSRERISFSTLEIAPKLi 
5401 - TAGAGCTAAGTTGTTTAACAAGTGTGTTTAATGCTTGAGCATTCTGGTTAACAACGTCTT - 5460 

- * S i VV*QVCLMLEHSG*QRL 
-RAKLFNKCV*CLS ILVNNVL 

ELSCLTSVFNA*AFWLTTSC 
5461 - GCAGCTTGCCCAATGCAGTTGATGTTGTTGTAAGTGATTCTTGAATTTGACTAATCGCCT - 5520
-AACPMQLMLL*VILEFD*SP 
-QLAQCS*CCCK*FLNLTNRL 
SLPNAVDVVVSDS*I*LIAL 
5521 - TGTTAAATTGGTTGGCGATTTGTTTTTGGTTCTCATAGAGAACATTTTGGGTAACTCCAA - 5580 
-C*IGWRFVFGSHREHFG*LQ 
-VKLVGDLFliVLIENILGNSN 
LNWLAICFKFS*RTFWVTPM 
5581 - TGCCATTGAACCTATATGCCATTTGCATAGCAAAAGGTATTTGAAGAGCAGCGCCAGCAC - 5640 
-CH*TYMPFA*QKVFEEQRQH 
-AIEPICHLH3KRYLKSSAST 
PL>NLYAICIAKGI*RAAPAP 
5641 - CAAATGTCCATCCAGCAGTGGCAGTACCACTAACTAGAGCAGCAGTGTAGGCAGCAATCA - 5700 
-QMS IQQWQYH * LEQQCRQQS 
-KCPSSSGSTTN*SSSVGSNH 
NVHPAVAVPLTRAAV*AAII 
5701 - TATCATCAGTGAGCAGAGGTGGCAACACTGTAAGTCCATTGAACTTCTGCGCACAAATGA - 5760
-YHQ^AEVATL'VH^TSAHK* 
I 1 SEQRWQHCKS IELLRTNE 
SSVSRGGNTVSPLNFCAQMR 
5761 - GATCTCTAGCATTAATATCACCTAGGCATTCGCCATATTGCTTCATGAAGCCAGCATCAG - 5820 
-DL*H*YHLGIRHIAS*SQHQ 

- ISSINIT*AFAILLHEASIS 

SLALISPRHSPYCFMKPASA 
5821 - CGAGTGTCACCTTATTAAAGAGCAAGTCCTCAATAAAAGACCTCTTAGTTGGCTTTAGAG - 5880 
-RVSPY*RASPQ*KTS*LALE 
-ECHLIKEQVLNKRPLSWL" l "R 

SVTLLKSKSSIKDLLVGFRG 
5881 - GGTCAGGTAATATTTGTGAAAAATTAAAACCACCAAAATATTTCAAAGTTGGGGTTTTGT - 5940
-GQVI FVKN*NHQNISKLGFC 
-VR*YL*KIKTTKIFQSWGFV 
S GN I CEKLKPPKYFKVGVLY 

5941 - ACATTTGTTTGACTTGAGCGAACACTTCACGTGTGTTGCGATCCTGTTCAGCAGCAATAC - 6000

- T F V * LERTLHVCCDPVQQQY 
-HLFDLSEHFTCVAILFSSNT 

ICLT* ANTSRVLRSCSAAIP 
6001 - CTGAGAGTGCACGATTTAGTTGTGTGCAAAAGCTACCATATTGGAGAAGCAAATTAGCAC - 6060 
-LRV H DLVVCKSYHI GEAN * H 

- *ECTI*LCAKATILEKQIST 

ESARFSCVQKLPYWRSKLAH 
6061 - ATTCAGTAGAATCTCCGCAGATGTACATATTACAATCTACGGAGGTTTTAGCCATAGAAA - 6120 
-IQ*NLRRCTYYNLRRF*P*I< 
-FSRISADVHITIYGGFSHRN 
SVES PQMYILQSTEVLAIET 
6121 - CAGGCATTACTTCTGTAGTAATGCTAATTGAAAAGTTAGTAGGTATAGCAATGGTGTTAT - 6180 
-QALLL* *C*LKS**V*QWCY 
-RHYFCSNAH*KVSRYSNGVI 
GITSVVMLIEKLVGIAMVLL 
6181 - TAGAGTAAGCAATTGAACTATCAGCACCTAAAGACATAGTATAAGCCACAATAGATTTTT - 6240 

- * SKQLMYQHLKT * Y K P Q * I F 
-RVSN*TIST*RHSISHNRFL 

E*AIELSAPKDIV i ATIDFW 
6241 - GGCTAGTACTACGTAATAAAGAAACTGTATGGTAACTAGCACAAATGCCAGCTCCAATAG - 6300 
-G*YYVIKKLYGN*HKCQLQ* 

- ASTT**RNCMVTSTNASSNR 

LVLRNKETVW*LAQMPAPIG 
6301 - GAATGTCGCACTCATAAGAAGTGTCGACATGCTCAGCTCCTATAAGACAGCCTGCTTGAG - 6360 
-ECRTRKKCRHAQLL* DSL1E 
-NVALIRSVDM1SSYKTACLS 
MSHS*EVSTCSAPIRQPA*V 
6361 - TCTGGAATACATTGTTTCCAGTAGAATATATGCGCCAAGCTGGTGTGAGTTGATCTGCAT - 6420 
-SGIHCFQ*NICAKLV*VDLH 
-LEYIVSSRIYAPSWCELICM 
WNTLFPVEYMRQAGVS*SA* 
6421 - GAATTGCTGTAGAAACATCAGTGCAGTTAACATCTTGATATAGAACAGCAACTTCAGATG - 6480 
-ELL*KHQCS*HLDIEQQLQM 
-NCCRNISAVNILI*NSNFR* 
IAVETSVQLTS*YRTATSDE 
6481 - AAGCATTTGTTCCAGGTGTAATTACACTTACACCCCCAAAAGAGCAAGGTGAAATGTCTA - 6540
-KHLFQV*LHLHPQKSKVKCL 
-SICSRCNYTYTPKRAR*NV* 
AFVPGVITLTPPKEQGEMSN 
6541 - ATATTTCAGATGTTTTAGGATCTCGAACGGAATCAGTGAAATCAGAAACATCACGGCCAA - 6600
-I FQMF*DLERNQ*NQKHHGQ 
-YFRCFRISNGISEIRNITAK 
ISDVLGSRTESVKSETSRPN 
6601 - ATTGTTGAAATGGTTGAAATCTCTTTGAAGAAGGAGTTAACACACCAGTACCAGTGAGTC - 6660 
-IVEMVEISLKKELTHQYQ*V 
-LLKWLKSL*RRS*HTSTSES 
C*NG*NLFEEGVNTPVPVSP 
6661 - CATTAAAATTAAAATTGACACACTGGTTCTTAATAAGGTCAGTGGATAATTTTGGTCCAC - 6720 
-H*N*N*HTGS**GQWIILVH 
-IKIKIDTLVLNKVSG*FWST 
LKLKLTHWFIjIRSVDNFGPQ 
6721 - AAACCGTGGCCGGTGCATTTAAAAGTTCAAAAGAAAGTACTACAACTCTGTAAGGTTGGT - 6780
-KPWPVHIjKVQKKVLQLCKVG 
-NRGRCI*KFKRKYYNSVRLV 
TVAGAFK5SKESTTTL*GW* 
6781 - AGCCAATGCCAGTAGTGGTGTAAAAACCATAATCATTTAATGGCCAATAACAATTAAGAG - 6840
-SQCQ*WCKNHNHLMANNN*E 
-ANASSGVKTII1*WPITIKS 
PMPVVV*KP t SB*NGQ i QLRA 
6841 - CAGGTGGGGTGCAAGGTTTGCCATCAGGGGAGAAAGGCACATTAGATATGTCTCTCTCAA - 6900
-QVGCKVCHQGRKAH* ICLSQ 
-RWGARFAIRGERHIRYVSLK 
GGVQGLPSGEKGTLDMSLSK 
6901 - AGGGCCTAAGCTTGCCATGTCTAAGATACCTATATTTATAATTATAATTACCAGTTGAAG - 6960
-RA*ACHV*DTYiyN5fN5fQLK 
-GPKLAMSKIPIFIIIITS*S 
GLSLPCLRYLYL*L*LPVEV 
6961 - TAGCATCAATGTTCCTAGTATTCCAAGCAAGGACACAACCCATGAAATCATCTGGCAATT - 7020 
-*HQCS*"YSKQGHNP*NHLAI 
-SINVPSIPSKDTTHEIIVJQF 
ASMFLVFQARTQPMKSSGNL 
7021 - TATAATTATAATCAGCAATAACACCAGTTTGTCCTGGCGCTATTTGTCTTACATCATCTC - 7080 
-YNYNQQ* HQFVLA1FVLHHL 
-IIIISNNT3LSWRYLSYIIS 
*L*SAITPVCPGAICLTSSP 
7081 - CCTTGACTACAAAAGAATCTGCATAGACATTGGAGAAGCAAAGATCATTCAACTTAGTGG - 7140 

- p * LQKNLHRHWRSKDHST * W 
-LDYKRICIDIGEAKI I Q 1. S G 

LTTKESA*TLEKQRSFNLVA 
7141 - CAGAAACGCCATAGCACTTAAAGGTTGAAAAAAATGTTGAGTTGTAGAGCACAGAGTAAT - 7200
-QKRHST * RLKKMLSCRAQSN 
-RNAIALKG*KKC*VVEHRVI 
ETP*HLKVEKNVEL*STE*S 
7201 - CAGCAACACAATTAGAAATTTTTTTTCTCTCCCATGCATAGACAGAAGGGAATTTAGTAG - 7260
-QQHN *KFFFSPMHRQKGI* * 
-SNTIRNFFSLPCIDRREFSS 
ATQLEIFFIiSHA*TEGNLVA 
7261 - CATTAAAAACCTCTCCAAAAGGACACAAGTTTGTAATATTAGGGAATCTCACAACATCTC - 7320
-H*KPLQKDTSL i Y-*GISQHL 
-IKNLSKRT QVCNIRESHNIS 
LKTSPKGHKFVILGNLTTSP 
7321 - CTGAGGGAACAACCCTGAAATTAGAGGTCTGGTAAATTCCTTTGTCAATCTCAAAGCTCT - 7380 
-LREQP*N J 'R3GKFLCQSQSS 

- * GNN PE I RGLVNSFVNLKAIi 

EGTTLKLEVW*IPLSISKLL 
7381 - TAACAGAGCATTTGAGTTCAGCAAGTGGATTTTGAGAACAATCAACAGCATCTGTGATTG - 7440
-*QSI*VQQVDFENNQQHIi*L 
-NRAFEFSKWILRTINSICDC 
TEHLSSASGF*EQSTASVIV 
7441 - TACCATTTTCATCATACTTGAGCATAAATGTAGTTGGCTTTAAATAGCCAACAAAATAGG - 7500
-YHFHHT*A*M*LALNSQQNR 
-TIFIILEHKCSWL*IANKIG 
PFSSYLSINVVGFK*PTK*A 
7501 - CTGCAGCTGACGTGCCCCAAATGTCTTGAGCAGGTGAAAAGGCTGTAAGAATGGCTCTAA - 7560 
-LQLTCPKCLEQVKRL*EWL'* 
-CS*RAPNVLSR*KGCKNGSK 
A A D V P Q M 3 * A G E K A V R M A L K 
7561 - AATTTGTAATGTTAATACCAAGAGGCAACTTAAAAATAGGTTTCAAAGTGTTAAAACCAG - 7620
-NL*C*YQEAT*K*VSKC*NQ 
-ICNVNTKRQLKNRFQSVKTR 
FVMLIPRGNLKIGFKVL KPE 
7621 - AAGGTAGATCACGAACTACATCTATAGGTTGATAGCCCTTATAAACATAGAGAAACCCAT - 7680
-KVDHELHL*VDSPYKHRETH 
-R*ITNYIYRLIALINIEKPI 
GR3RTTSIG**PL*T*RNPS 
7681 - CTTTATTTTTAAACACAAACTCTCGTAAGTGTTTAAAATTACCTGACTTTTCTGAAACAT - 7740
-LYF*TQTLVSV*NYLTFLKH 
-FIFKHKLS^'VFKIT* L F * N I 
LFLNTNSRKCLKLPDFSETS 
7741 - CAAGCGAAAAGGCATCAGATATGTACTCGAAAGTGCAATTAAATGCATTATCGAATATCA - 7800
-QAKRHQICTRKCN*MHYRIS 
-KRKGIRYVLESAIKCIIEYH 
SEKASDMYSKVQLNALSNII 
7801 - TAGTATGTGTCTGTGTACCCATGGGTTTAGAAACAGCAAAGAAAGGGTTGTCACACAATT - 7860 
-*YVSVYPWV*KQQRKGCHTI 
-SMCLCTHGFRNSKERVVTQF 
VCVCVPMGIiETAKKGLSHNS 
7861 - CAAAGTTACATGGTCGTATAACAACATTAGTAGAATTGTTAATAATAATCACCGACTGTG - 7920 
-QSYMLV*QH**NC***SPTV 
-KVTCSYNNISRIVNNNHRL* 
KLHARITTLVELLIIITDCD 
7921 - AGTTGTTGTTCATGGTAGAACCAAAAACCCAACCACGGACAACATTTGATTTCTCTGTGG - 7980 
-TCCSW^-NQKPNHGQHLISLW 
-LVVHGRTKNPTTDNI*FliCG 
LLFMVEPKTQPRTTFDFSVA 
7981 - CAGCAAAATAAATACCATCCTTAAAAGGTATGACAGGGTTGCCAAACGTATGATTAATAG - 8040
-QQNKYHP*KV*QGCQTYD* * 
•-SKINTILKRYDRVAKRMINS 
A K * I PSL KGMTGLPNV* L I V 
8041 - TATGAAACCCTGTAACATTAGAATAAAATGGAAGAAATAAATCCTGAGTTAAATAAAGAG - 8100 
-YETL*H*NKMEEINPBLNKE 
-MKPCNIRIKWKK*ILS*IKS 
*NPVTLE*NGRNKS*VK*RV 
8101 - TGTCTGATCTAAAAATTTCATCAGGATAGTAAACCCCCCTCATAGATGAAGTATGTTGAG - 8160 
-CLI*KFHQDSKPPS*MKYVE 
-V*SKNFIRIVNPPHR*SMLS 
SDLKISSG**TPLIDEVC*V 
8161 - TGTAATTAGGAGCTTGAACATCATCAAAAGTGGTGCACCGGTCAAGGTCACTACCACTAG - 8220 
-CN*ELEHHQKWCTGQGHYH* 
-VIRSLNIIKSGAPVKVTTTS 
*LGA*TSSKVVHRSRSLPLV 
8221 - TGAGAGTAAGAAATAATAAGAAAATAAACATGTTCGTTTAGTTGTTAACAAGAATATCAC - 8280 
-*E*EIIRK*TCSFSC*QEYH 
-ESKK**ENKHVRLVVNKNIT 
- RVRNNKKINMFV*LLTRISL 
8281 - TTGAAACCACAACTCTGTTGTTTTCTCTAATGATAAGCCTACCTTTTTCCAGAAGAGAAT - 8340 
-LKPQLCCFL* * * AYLF PEEN 
-*NHNSVVFSNDKPTFFQKRI 
ETTTLLFSLMISLPFSRRE* 
8341 - AAATCATATCATTGATTTGATTCTCCTTAAGAGACATTACAGCAGTTCCTCTTAATTTAA - 8400
-KSYH* FDSP*ETLQQFLLI * 
-NHIIDLILLKRHYSSSS*FK 
IISLI^FSLRDITAVPLNLR 
8401 - GAGGAAATTTGCTCATGTCAAAGAGTGAATAGGAAGACAACTGGATAGGATTTGTGTTCC - 8460
-EEICSCQRVNRKTTG* DLCS 
-RKFAHVKE*IGRQLDRICVP 
GNLLMSKSE*EDNWIGFVFL 
8461 - TCCAGAAAATGTAGTTAGCATGCATGGTATAGCCATCAATTTGTTCCTTCGGCTTGCCAA - 8520 
-S RKC S * HA W Y S HQ FV E SA C Q 
-FENVVSMHGIA1NLFLRLAK 
Q K M - L A C M V * P S I C S F G L P R 
8521 - GATAGTTAGCCCCAATTAAAAATGCTTCCGATGATGATGCATTTACATTTGTAACAAAAG - 8580 

- D S * PQLKMLPMMMHLHL* QK 
-IVSPN*KCFR**CIYICNKS 

*1APIKNASDDDAFTFVTKA 
8581 - CTGTCCACCATGAGAAATGGCCCATAAGCTTGTAAAGGTCAGCATTCCAAGAATGCTCTG - 8640 
-L3TMRNGP*ACKGQK£'KNAL 
-CPP*EMAHKLVKVSIPRMI>C 
VHHEKWPISL*RSAFQECSV 
8641 - TTATCTTTACAGCTATAGAACCACCCAGGGCTAGTTTTTGCTTTATAAATCCACACAGAT - 8700
-LSLQL^'NHPGIiVFAL* IHTD 
-YLYSYRTTQG* FLLYKSTQ1 
IFTAIEPPRASFCFINPHR* 
8701 - AAGTGAAAAACCCTTCTTTAGAGTCATTCTCTTTTGTCACATGTTTGGTCCTAGGGTCAT - 8760
-K*KTLLi* SHSLLSHVWS*GH 
-SEKPFFRVILFCHMFGPRVI 
VKNPSL ESFSFVTCLVLGSY 
8761 - ACATATCGCTAATAATAAGGTCCCATTTATTAGCCGTATGTACTGTTGCACAGTCTCCAA - 8820
-TYR***G?IY*PYVLLHSLQ 
-HIANNKVPFISRMYCCTVSM 
ISLIIRSHLLAVCTVAQSPI 
8821 - TTAAAGTAGAATCTGCGTCGGAGACGAAGTCATTAAGATCTGAATCGACAAGTAGTGTGC - 8880
-LK*NLRRRRSH* DLNRQVVC 
-*SRICVGDEVrKI*IDK*CA 
KVESASETKSLRSESTSSVP 
8881 - CAGTTGGCAACCATTGTCTGAGCACAGCTGTACCTGGTGCAACTCCTTTATCAGAGCCAG - 8940 
-QLATIV*AQLYLVQLLYQSQ 
-SWQPLSEHSCTWCNSFIRAS 
VGNHCLSTAVPGATPLSEPA 
8941 - CACCAAAGTGAATAACTCTCATGTTGTAGGGTACAGCTAAAGTAAGTGTATTTAAGTATT - 9000
-HQSE*LSCCRVQLK*VYLSI 
-TKVNNSHVVGYS*SKCI*VL 
PK*ITLML*GTAKVSVFKY* 
9001 - GACACAGTTGAGTATACTTTGCGACATTCATCATTATTCCTTTTGGTATAACAGCATTTT - 9060 
-DTVEYTLRHSSLFLLV*QHF 
-TQLSILCDIHHYSFWYNSIF 
HS*VYFATF1IIPFGITAFS 
9061 - CACCATAATTCTGAAGGTCACACTTTTCAAGAAGCATTCTTTGCATCTTGTACAAGTTAG - 9120 
-HHNSEGHTFQEAFFASCTS* 

- T I ILKVTLFKKHSLHLVQVR 

P*F*RSHFSRSILCILYKLG 
9121 - GCATCGCAACACCTGGTTGCCACGCTTGACTTGCTTGTAGTTTTGGGTAGAAGGTTTCAA - 9180
-ASQHLVATLDLLVVLGRRFQ 
-HRNTWLPRLTCL*FWVEGFN 
3ATPGCHA*LACSFG*KVST 
9181 - GATGTCGATCCTTACACCAAAGCATGAATGAAATTTCAGCATAGTCAATTGTAACCTTGA - 9240 
-HVHPYTKA*MKFQHSQL* P * 
-MSILTPKHE*NFSIVNCNLD 
CPSLHQSMNEISA*SIVTLT 
9241 - CCACTTTTGAAATCACTGACAAATCTTGTGACTTTATTATCTCGACAAAGTCATCAAGTA - 9300
-PLLKSLTNLVTLLSRQSHQV 
-BF*NH *QIL*UYYLDKVIK* 
TFEITDKSCDFIISTKSSSK 
9301 - AAAGATCAATCACAGAACACACACATTTTGATGAACCTGTTTGCGCATCTGTTATGAAGT - 9360
-KDQSQNTHILMNLFAHLL*S 
-KINHRTHTF**TCLRICYEV 
RSITEHTHFDEPVCASVMK* 
9361 - AATTTTTCACTGTGCTGTCCATAGGGATAAAATCCTCTAATTTAAGTGGTGAATCTTGTG - 9420
-NFSLCCP'<G*NPLI * V V N L V 
I FHCAVHRDKIL* FKW* IL* 
FFTVLSIGIKSSNLSGESCE 
9421 - AGCGCTTGGCTAAGCCTATCATTAAATGAAGACCGCCAAGTTGTCCATGACTGAAATCTC - 9480 
-SAWLSLSLNEDRQVVHD*NL 
-ALG*AYH*MKTAKLSMTEIS 
RLAKPIIK*RPPSCP*LKSP 
9481 - CATAAACGATGTGTTCGAAGGCATAGCCCTCGAGCTTATATCGCTGTATGAATTCATCCA - 9540 
-HKRCVRRHSPRAYIAV* IHP 
-INDVFEGIALELISLYEFIH 



-*RARESQFPFVIWA*NPLSL 
-SELEKVSFHL*SGLKIL*VS 
A53RK5VSICDLGLKSSKSL 
9601 - TGCTCTGAGTAAAGTAGGTTTCAGGCAACTGTTGAATAATGCCGTCTACTTTCTTAAAGT - 9660 
-CSE* 3RFQATVE*CRLLS-*S 
-ALSKVGFRQLLNNAVYFLKV 
L*VK*VSGNC*IMPSTFLK i ' 
9661 - AGTTAAACTGTGTTTTTACTGATTCTCCAATTAATGTGACTCCATTGACGCTAGCTTGTG - 9720
-S J, TVFLLILQLM*LH*R*LV 
-VKLCFY*FSN*CDSIDASLC 
LNCVFTDSPINVTPLTLACA 
9721 - CTGGTCCCTTTGAAGGTGTTAGACCTTTGACTGAACCTTCTGTTATTAAAACACCATTAC - 9780
-LVPLKV1DL*LNLLLLKHHY 

- WSL*RC*TFD*TFCY*NTIT 

GPFEGVRPLTEPSVIKTPLR 
9781 - GGGCGTTTCTAAAAAGGTCTACCTGTCCTTCCACTCTACCATCAAACAAGACAGTAAGTG - 9840

- G R F * KGLPVLPLYHQTRQ*V 
-GVSKKVYLSFHSTIKQDSK* 

AFLKRST CPSTLPSNKTVSE 
9841 - AAGAACAAGCACTCTCAGTAGGTTTCTTGGCAATGTCAGTCATTGTGCAGACACCTATTG - 9900 
-KNKHSQ*VSWQCQSLCRHLL 
-RTSTLSRFLGNVSHCADTYC 
EQALSVGFLAMSVIVQTPIV 
9901 - TAGATACATGTGCTGGGGCTTCTCTTTTGTAGTCCCAGATTACAGTATTAGCAGCGATAT - 9960
-* IHVLGLLFCSPRLQY* QRY 
-RYMCWGFSFVVPDYSISSDI 
DTCAGASLL*SQI TVLAAIS 
9961 - CAACACCCAAATTATTGAGTATCTTAATCTCTGGCACTGGTTTAATGTTACGCTTAGCCC - 10020 
-QHPNY*VS*SLALV*CYA*P 

- N T Q I IEYLNLWHWFNVTLSP 

TPKLLSILISGTGLMLRLAQ 
10021 - AAAGCTCAAATGCAACATTAACAGGAAGTGTTGTCTTATTTTCAAAGATCTCCACATCAA - 10080 
-KAQMQH*QEVLSYFQRSPHQ 
-KLKCNINRKCCLIFKDLHIN 

33NA TLTGSVVLFSKISTSI 
10081 - TACCATCTACCTTTGTGTAAACAGCATTATTAATGATGGAAACAGGTGCTTCGCCGGCGT - 10140
-YHL PLCKQHY* * WKQVLRRR 
-TIYLCVNSIINDGNRCFAGV 
PSTFV*TALLMMETGASPAC 

10141 - GTCCATCAAAGTGTCCTTTATTAACAACATTATAAGCCACATTTTCTAAACTCTGTAACC - 10200 
-VHQSVLY*QHYKPHFLNSVT 
-SIKVSFrNNIISHIF*TL*P 
PSKCPLLTTL*ATFSKLCWL 

10201 - TGGTAAATGTATTCCACAGGTTATAAGTATCAAATTGTTTGTAAATCCATAGGCTAAATC - 10260 
-VJ-*MYSTGYKYQIVCKSIG*I 
-GKCIPQVISIKLFVNP*AKS 
VNVFHRL*VSNCli* IHRLNP 

10261 - CAGCAGAAATCATCATATTATATGCATCCAAGTACTGTCGGTACTCATTTGCATGGTGTC - 10320 
-QQKS SYYMHP STVGTH LHGV 
-SRNHHIICIQVLSVLICMVS 

- AEIIILYASKYCRYSFAWCL 

10321 - TGCAAACAGCACCACCTAAATTGCATCGTGTAATACACGTAGCAGATTTGAGTGGAACAT - 10380 
-CKQHHLNCIV*YT*QI*VEH 

- A N S T T * IASCNTRSRFEWNI 

QTAPPKLHRVIHVADLSGT* 
10381 - AATCAATATCCGACACTACTTGTTTGCCATGAGACTCACAAGGACTATCAGAATAGTAAA - 10440
-NQYPTLLVCHETHKDYQNSK 
-INIRHYLFAMRLTRTIRIVK 
SISDTTCLP*DSQGLSE**K 
10441 - AGAAAGGCAATTGCTTTAAATTAGTAAATGCACTTTTATCGAAAGCTGGAGTGTGGAATG - 10500
-RKA IAL.N * *MHFYRKLE CGM 
-ERQLL* ISKCTFIESWSVEC 
KGNCFKLVNALLSKAGVWNA 
10501 - CATGCTTATTCACATACAAACTACCACCATCACAGCCTGGTAAGTTCAAGTTTGACAAGA - 10560
-HAYSHTNYHHHSLVSSSLTR 
-MLIHIQTTTITAW*VQV*QD 
CLFTYKLPPSQPGKFKFDKT 
10561 - CTCTTGTGTCAAACCTACACACAATTGCATTGGCTGGGTAACGATCAACGTTACAATTCC - 10620
-LLCQTYTQLHWLGNDQRYNS 
-SCVKPTHNCIGWVTINVTIP 

- LVSNLHTIALAG*RSTLQFQ 

10621 - AAAACAAACAAACACCATCAGTGAATTTATCGTGATGTGTAGCATAAGAATAGAAGAGTT - 10680
-KTNKHHQ*IYRDV*HKNRRV 
-KQTNTISEFIVMCSIRIEEF 
NKQTPSVNLS*CVA*E*KSS 
10681 - CCTCTATTTTGTAAGCTTTGTCACTACATGGCTGAGCATCGTAGAACTTCCATTCTACTT - 10740
-PLFCKLCHYMAEHRRTSILIi 
LY FV S FV TTWLS IVEL P FY F 
SIL*ALSLHG*AS*NFHSTS 
10741 - CAGCCTGAGGCACACACTTGATAGCCTTTGGATTTCCAATGTCATGAAGAACTGGAAACT - 10800 
-QPEAHT * * PLDFQCHEELET 
-SLRKTLDSLWISNVMKNWKL 
A*GTHLIAFGFPMS*RTGNL 
10801 - TATCAGCAAGCAATGCAGACTTCACAACCATGTGTTGTACTTTTCTGCAAGCAGAATTAA - 10860 
-YQQAMQTSQPCVVLFCKQN* 
-ISKQCRLHNHVLYFSASRIN 
SASNADFTTMCCTFLQAELT 
10861 - CCCTCAGTTCATCTCCTATAATAGGGTATTCAACAGACCAATCAACGCGCTTAACAAAGC - 10920 
-PSVHLL** GIQQTNQRA*QS 
-PQF1SYNRVFNRPINALNKA 
LSSSPI IGYSTDQSTRLTKH 
10921 - ACTCATGGACTGCTAAACATCTAGTCATGATAGCATCACAACTAGCCACATGTGCATTTC - 10980

■THGLLNI*S**HHN*PHVHF 

-LMDC*TSSHDSITTSHMCIS 

SWTAKHLVMIASQLA-'CAFP 
10981 - CATGTACCTGGCAATGTTGGTCATGGTTACTCTGAAGGTTACCCGTAAAGCCCCACTGCT - 11040
-HVPGNVGHGYSEGYP*SPTA 
• MYLAMLVMVTLKVTRKAPLL 



-EHQS*MGYRHSQNPQNDSSR 
-NINHKWVI DIVKTHRMI PAG 
TSIINGL*T*SKPTE* FQQA 
11101 - CATAAGTATCTGATGAAGTAGAAAAGCAAGTTGCACGTTTGTCACACAGACAACACGTTC - 11160 
-HKYLMK* KSKLHVCHTDNTF 

- ISI**SRKASCTFVTQTTRS 

*VSDEVEKQVARLSHRQHVL 
11161 - TTTCAGGTCCAATCTTGACAAAGTACTTCATTGATGTAAGCTCAAAGCCATGCGCCCAAA - 11220 

- F Q V Q S * QSTSIiM*AQSHAPK 
.- FRSNLDKVLH*CKLKAMRPK 

SGPILTKYFIDVSSKPCAQR 
11221 - GGACGAACACGACTCTGTCTGACAATCCTTTCAGTGTATCACTGAGCATTTGTACTATCT - 11280 
-GRTRLC1T I L S V Y H * A F V L S 

- DEHDSV*QSFQCITEHLYYL 

TNTTLSDNPFSVSLSICTIL 
11281 - TAATACGCACTACATTCCAGGGCAAGCCTTTATACATGAGTGGTATAAGATGTTTAAACT - 11340 

- * YALHS RASLYT * V V * DV* T 
-NTHYIPGQAFIHEWYKMFKL 

IRTTFQGKPLYMSGIRCLNW 
11341 - GGTCACCTGGTGGAGGTTTTGCATTAACTCTGGTGAATTCTGTGTTATTTTCAGTGTCAA - 11400 
-GHLVEVLH*LW*ILCYFQCQ 
-VTWWRFCINSGEFCVIFSVN 
SPGGGFALTLVNSVLFSVST 
11401 - CATAACCAGTCGGTACAGCTACTAAGTTAACACCTGTAGAAAATCCTAGCTGGAGAGGTA - 11460
-HNQSVQLLS *HL*KI LAGEV 

- ITSRYSY i VNTCRKS*LER* 

*PVGTATKLTPVENPSWRGR 
11461 - GGTTAGTACCCACAGCATCTCTAGTTGCATGACAGCCCTCTACATCAAAGCCAATCCACG - 11520
-G*YPQHL*LHDSPLHQSQST 
-VSTHSISSCMTALYIKANPR 
LVPTASLVA*QPSTSKPIHA 
11521 - CACGAACGTGACGAATAGCTTCTTCGCGGGTGATAAACATATTAGGGTAACCATTGACTT - 11580 
-HERDE*LLRG* *TY*GNH*L 
-TNVTNSFFAGDKHIRVTIDL 
RT*RIASSRVINILG*PLTW 
11581 - GGTAATTCATTTTGAAACCCATCATAGAGATGAGTCTACGGTAGGTCATGTCCTTTGGTA - 11640 
-GNSF*NES*R*VYGRSCPLV 
-VIHFETHHRDESTVGHVLWY 
*FILKPIIEMSLR*VMSFGM 
11641 - TGCCTGGTATGTCAACACATAATCCTTCAGTCTTGAATTTTATATCAACGCTGAGGTGTG - 11700 
-CLVCQHIILQS*ILYQR*GV 
-AWYVNT*SFSLEFYINAEVC 
PGMSTHNPSVLNFISTLRCV 
11701 - TAGGTGCCTGTGTAGGATGAAGACCAGTAATGATCTTACTACAGTCCTTAAAAAGTCCAG - 11760 
-*VPV*DEDQ**SYYSP*KVQ 
-RCLCRMKTSNDLTTVLKKSS 
G A C V G * R P V M I L L Q S L K S P V 
11761 - TTACATTTTCTGCTTGTAATGTAGCCACATTGCGACGTGGTATTTCTAGACTTGTAAATT - 11820

-LHFLLVM*PHCDVVFLDL*I
-YIFCL*CSHIATWYF*TCKL
TFSACNVATLRRGISRLVNC 
11821 - GCAGTTTGTCATAAAGATCTCTATCAGACATTATGCACAAAATGCCAATTTTTGCCCTTG - 11880 
-AVCHKDLYQTLCTKCQFLPL
-QFVIKISI RHYAQNANFCPC 



-**PH*SG*HYKSVLFQ*FV* 
-DSHIEAVDITRVCCFSSLCE 
IATLKRLTLQECAVSVVCVN 
11941 - ATATGACATAGTCATATTCAGAACCCTGTGATGAATCAACAGTCTGCGTAGGCAATCCTA - 12000 
-I*HSHIQNPVMNQQSA*AIL 
-YDIVIFRTL**INSLRRQS* 
MT*SYSEPCDESTVCVGNPK 
12001 - AGATTTTTGAAGCTACAGCGTTCTGTGAATTATAAGGTGAGATAAAAACAGCTTTTCTCC - 12060 
-RFLKLQRSVNYKVR*KQLFS 
-DF J 'SYSVL J 'IIR*DKNSFSP 
IFEATAFCEL*GEIKTAFLQ 
12061 - AAGCAGGATTGCGTGTAAGAAATTCTCTTACAACGCCTATTTGAGGTCTGTTGATTGCAG - 12120 
-KQDCV*EILLQRLFEVC*LQ 
-S. RIACKKF SYNAYLRSVDCR 
AGLRVRNSLTTPI*GLLIAD 
12121 - ATGAAACATCATGTGTAATAACACCTTTGTAGAACATTTTGAAGCATTGAGCTGACTTAT - 12180 
-MKHHV**HLCRTF*SIELTY 
- * NIMCNNTFVEHFEALS*LI 
ETSCVITPL*NILKH*ADLS 
12181 - CCTTGTGTGCTTTTAGCTTATTGTCATAAACTAAAGCACTCACAGTGTCAACAATTTCAG - 12240
-PCVLLAYCHKLKHSQCQQFQ 
-LVCF*LIVIN*STHSVNNFS 
LCAFStiLS*TKALTVSTISA 
12241 - CAGGACAACGGCGACAAGTTCCAAGGAACATGTCTGGACCTATTGTTTTCATAAGTCTGC - 12300 
-QDNGDKFQGTCLDLLFS*VC 
-RTTATSSKEHVWTYCFHKSA 
GQRRQVPRKMSGPIVF1SLH 
12301 - ACACTGAATTAAAATATTCTGGTTCTAGTGTGCCTTTAGTCAGCAATGTGCGGGGGGCTG - 12360 
-TLN + NILVLVCL*SAMCGGL 
-H*IKIFWF*CAFSQQCAGGW 
TELKYSGSSVPLVSNVRGAG 
12361 - GTAATTGAGCAGGATCGCCAATATAGACGTAGTGTTTTGCACGAAGTCTAGCATTGACAA - 12420 
-VIEQDRQYRRSVLHEV*H*Q 
-*LSRIANIDVVFCTKSSIDN 
N*AGSPI*T*CFARSLALTT 
12421 - CACTCAAGTCATAATTAGTAGCCATAGAGATTTCATCAAAGACTACAATGTCAGCAGTTG - 12480 
-HSSHN**P*RFHQRLQCQQL 
-TQVI ISSHRDFIKDYNVSSC 
LKS*LVAIEISSKTTMSAVV 
12481 - TTTCTGGCAATGCATTTACAGTGCAGAAAACATACTGTTCTAGTGTTGAATTCACTTTGA - 12540
-FLAMHLQCRKHTV1VLNSL* 
-FWQCIYSAENILF*C* IHFE 
SGNAFTVQKTYCSSVEFTLN 
12541 - ATTTATCAAAACACTCTACGCGCGCACGCGCAGGTATGATTCTACTACATTTATCTATGG - 12600
-I YQNTLRAHAQV* FYYIYLW 
-FIKTLYARTRRYDSTTFIYG 
LSKHSTRARAGMIIiLHLSMG 
12601 - GCAAATATTTTAATGCCTTTTCACATAGGGCATCAACAGCTGCATGAGAGCATGCCGTAT - 12660
-ANILMPFHIGHQQIiHESMPY 
-QIF*CLFT*GINSCMRACRI 
KY FNAFSHRASTAA* E HAVY 
12661 - ACACTATGCGAGCAGATGGGTAATAGAGAGCAAGTCCGATGGCAAAATGACTCTTACCAC - 12720
-TLCEQMGNREQVRWQNDSY Q 
-HYASRWVIESKSDGKMTLTS 
TMRADG**RASPMAK*LLPV 
12721 - TACCAGGTGGTCCTTGGAGTGTAGAGTACTTTTGCATGCCGACCTTTTGATAATTTGCAA - 12780 
-YQVVLGV*STFACRPFDNLQ 
-TRWSLECRVLLHADLLIICN 
PGGPWSVEYFCMPTF* *FAT 
12781 - CATTGCTAGAAAACTCATCTGAGATGTTGAGTGTTGGGTACAAGCCAGTAATTCTCACAT - 12840 
-HC*KTHLRC*VLGTSQ*FSH 

- IARKLI*DVECWVQASNSHI 

LLENSSEMLSVGYKPVILT* 
12841 - AGTGCTCTTGTGGCACTAGAGTAGGTGCACTAAGTGGCATTACAGTGTGAGATGTCAACA - 12900 
-SALVALE*VH*VALQCEMST 
-VLLWH*SRCTKWHYSVRCQH 
CSCGTRVGALSGITV* DVNT 
12901 - CAAAGTAATCACCAACATTCAACTTGTATGTCGTAGTACCTCTGTACACAACAGCATCAC - 12960 
-QSNHQHSTCMS^YLCTQQHH 
-KVITNIQLVCRSTSVHNSIT 
K*SPTFNL¥VVVPLYTTASP 
12961 - CATAGTCACCTTTTTCAAAGGTGTACTCTCCAATCTGTACTTTACTATTTTTAGTTACAC - 13020 
-HSHLFQRCTLQSVLYYF*LH 
-IVTFFKGVLSNLYFTIFSYT 
*SPFSKV YSPICTLLFLVTR 
13021 - GGTAACCAGTAAAGACATAGTTTCTGTTCAATGGTGGTCTAGGTTTTCCAACCTCCCATG - 13080
-GNQ*RHSFCSMVV*VFQPPM 
-VTSKDIVSVQWWSRFSNLP* 
*PVKT*FLFNGGLGFPTSHE 
13081 - AAAGATGCAATTCTCTGTCAGAGAGTACTTCGCGTACAGTGGCAATACCATATGACAGCT - 13140 
-KDAILCQRVLRVQWQYHMTA 
-KMQFSVREYFAYSGNTI*QL 
RCNSLSESTSRTVAIPYDSL 
13141 - TAAATGTTTCCTCAGTGGCTTTGAGCGTTTCTGCTGCGAAAAGCTTGAGTCTCTCAGTAC - 13200
-*MFPQWL*AFLLRKA*VSQY 
-KCFLSGFERFCCEKLESLST 
NVSSVALSVSAAKSLSLSVQ 
13201 - AAGTGTTGGCAAGTATGTAATCGCCAGCATTAGTCCAATCACATGTTGCTATCGCATTGA - 13260 
-KCWQVCNRQH*' SNHMLLSH* 
-SVGKYVIASISPITCCYRIE 
VLASM*SPALVQSHVAIALK 
13261 - AGTCAGTGACATTGTCACTGCCTACACATGTGTTTTTGTATAAACCAAAAACCTGACCAT - 13320 
-SQ*HCHCLHMCFCINQKPDH 

- V S DIVTAYTCVFV* TKNLTI 

SVTLSLPTHVFLYKPKT*PL 
13321 - TAGCACATAATGGAAAACTAATGGGAGGCTTATGTGACTTGCAATAATAGCTCATACCTC - 13380 
-*HIMEN*WEAYVTCNNSSY1> 
-ST*WKTNGRLM*LAIIAHTS 
AHNGKLMGGLCDLQ**LIPP 
13381 - CTAGATACAGTTGTGTCACATCAGTGACATCACAACCTGGGGCATTGCAAACATAGGGAT - 13440 

- L D T V V S H Q *• H H N L G H C K H R D 

- * I QLCHISDITTHGIANIGI 

RY3CVTSVTSQPGALQT"'GL 
13441 - TAACAGACAACACTAATTTGTGTGATGTTGAAATGACATGGTCATAGCAGCACTTGCAAC - 13500
-*QTTLICVMLK*HGH SSTCN 

- N R Q H * FV*C*NDMV I A A L A T 

TDNTNLCDVEMTWS*QHLQH 
13501 - ATAGGAATGGTCTCCTAATACAGGCACCGCAACGAAGTGAAGTCTGTGAATTGCACAATA - 13560 

- I 3 M V S * Y R H R N E V K S V N C T I 
-*EWSPNTGTATK*SL* IAQY 

RNGLLIQAPQRSEVCELHNT 
13561 - CACAAGCACCTACAGCCTGCAAGACTGTATGTGGTGTGTACATAGCCTCATAAAACTCAG - 13620
-HKHLQPARLYVVCT* PHKTQ 
-TSTYSLQDCMWCVHSLIKLR 
QAPTACKTVCGVYIAS*NSG 
13621 - GTTCCCAGTACCGTGAGGTGTTATCATTAGTTAGCATTACGGAATACATGTCCAACATGT - 13680
-VPSTVRCYH*LALRNTCPTC 
-FPVP*GVIIS*HYGIHVQHV 
SQYREVLSLVSITEYMSNMW 
13681 - GGCCAGTAAGCTCATCATGTAACTTTCTAATGTATTGTAAATACAAGTGAAAGACATCAG - 13740 
-GQ*AHHVTF*CIVNTSERHQ 
-ASKLIM*LSNVL* IQVKDIS 
PVSSSCNFLMYCKYK*KTSA 
13741 - CATACTCCTGATTAGGATGTTTTGTAAGTGGGTAAGCATCAATAGCCAGTGACACGAACC - 13800
-HTP D* DVL*VGKHQ* PVTRT 
-ILLIRMFCK»VSINSQ*HEP 
Y5*-LGCFVSG*ASIASDTNI J 
13801 - TTTCAATCATAAGTGTACCATCTGTTTTGACAATATCATCGACAAAACAGCCTGCGCCTA - 13860 

- F Q S * VYHLF*QYHRQNSLRL 
-FNHKCTICFDNIIDKTACA* 

SI ISVPSVLTISSTKQPAPN 
13861 - ATATTCTTGATGGATCTGGGTAAGGCAGGTACACGTAATCATCTCCTTGTTTAACTAGCA - 13920 
-IFLMDLGKAGTRNHLLV*LA 
-YS*WIWVRQVHVIISLFN*H 
ILDGSG*GRYT*SSPCIiTSI 
13921 - TTGTATGCTGTGAGCAAAATTCGTGAGGTCCTTTAGTAAGGTCAGTCTCAGTCCAACATT - 13980 
-LYAVSKIREVL**GQSQSNI 
-CML*AKFVRSFSKVSLSPTF 
VCCEQNS*GPLVRSVSVQHF 
13981 - TTGCCTCAGACATGAACACATTATTTTGATAATAAAGAACTGCCTTAAAGTTCTTAATGC - 14040 
-LPQT*THYFDNKELP*SS*C 
-CLRHEHIILIIKNCLKVLNA 
ASDMNTLF***RTALKFLML 
14041 - TAGCTACTAAACCTTGAGCCGCATAGTTACTGTTATAGCACACAACGGCATCATCAGAAA - 14100 
-*IiLNLEPHSYCYSTQRHHQK 
-SY*TLSRIVTVIAHNGIIRK 
ATKP*AA*LLL*HTTASSER 
14101 - GAATCATCATGGAGAAATGTTTACGCAGGTAAGCGTAAAACTCATCCACGAATTCATGAT - 14160 
-ESSWRNVYAGKRKTHPRIHD 
-NHHGEMFTQVSVKLIHEFMI 
IIMEKCI,RR*A*NSSTNS*S 
14161 - CAACATCCCTATTTCTATAGAGACACTCATAGAGCCTGTGTTGTAGATTGCGGACATACT - 14220 
-QHPYFYRDTHRACVVDCGHT 
-NIPISIETLIEPVL*IADIL 
TSLFL*RHS* 'SLCCRLRTYL 
14221 - TGTCAGCTATCTTATTACCATCAGTTGAAAGAAGTGCATTTACATTGGCTGTAACAGCTT - 14280 
-CQ1S Y YH QLKEVHLHW1* Q 1 
-VSYLITIS*KKCIYIGCNSL 
SAILLPSVERSAFTLAVTA* 
14281 - GACAAATGTTAAAGACACTATTAGCATAAGCAGTTGTAGCATCACCGGATGATGTTCCAC - 14340
-DKC * R H Y * HKQL* HHRMMFH 
-TNVKDTISISSCSITG*CST 
Q M L K T L L A * A V V A S P D D V P P 
14341 - CTGGTTTAACATATAGTGAGCCGCCACACATGACCATCTCACTTAATACTTGCGCACACT - 14400 
-LV*HIVSRHT* PSHLILAHT 
-WFNI**AATHDHLT*YLRTL 
GLTYSEPPHMTISLNTCAHS 
14401 - CGTTAGCTAACCTGTAGAAACGGTGTGATAAGTTACAGCAAGTGTTATGTTTGCGAGCAA - 14460 

- R * LTCRNGVISYSKCYVCEQ 
-VS*PVETV**VTASVMFASK 

LANL*KRCDKLQQVLCLRAR 
14461 - GAACAAGAGAGGCCATTATCCTAAGCATGTTAGGCATGGCTCTGTCACATTTTGGATAAT - 14520 
-EQERPLS * A C * AW LCH I LDN 
-NKRGHYPKHVRHGSVTFWII 
TREAIILSMLGMALSHFG*S 
14521 - CCCAACCCATAAGGTGTGGAGTTTCTACATCACTGTAAACAGTTTTTAACATATTATGCC - 14580 
-PNP*GVEFLHHCKQFLTYYA 
-PTHKVWSFYITVNSF*HIMP 
QPIRCGVSTSL*TVFNILCQ 
14581 - AGCCACCGTAAAACTTGCTTGTTCCAATTACCACAGTAGCTCCTCTAGTGGCGGCTATTG - 14640
-SHRKTCLFQLPQ*LL*WRLL 
-ATVKLACSNYHSSSSSGGY* 
PP*NLLVPITTVAPLVAAID 
14641 - ACTTCAATAATTTCTGATGAAACTGTCTATTTGTCATAGTACTACAGATAGAGACACCAG - 14700
-TSIISDETVYLS*YYR*RHQ 
-LQ*FLMKLSICHSTTDRDTS 
FNNF**NCLFVIVLQIETPA 
14701 - CTACGGTGCGAGCTCTATTCTTTGCACTAATGGCATACTTAAGATTCATTTGAGTTATAG - 14760
-LRCELYSLH*WHT* DS FEL* 
-YGASSILCTNGILKIHLSYS 
TVRALFFALMAYLRFI * V I V 
14761 - TAGGGATGACATTACGCTTAGTATACGCGAAAAGTGCATCTTGATCCTCATAACTCATTG - 14820
-*G*HYA*YTRKVHLDPHNSL 
-RDDITLSIREKCILILITH* 
GMTLRLVYAKSAS*SS*LIE 
14821 - AGTCATAATAAAGTCTAGCCTTACCCCATTTATTAAATGGGAAACCAGCTGATTTATCCA - 14880 
-SHNKV*PYPIY*MGNQLIYP 
-VIIKSSLTPFIKWETS*FIQ 
S**SLALPHLLNGKPADLSR 
14881 - GATTGTTAACGATTACTTGGTTGGCATTAATACAGCCACCATCGTAACAATCAAAGTATT - 14940
"DC* RLLGWH* YSHHRNNQS I 
-IVNDYLVGINTATIVTIKVF 
LLTITWLALIQPPS*QSKYL 
14941 - TATCAACAACTTCAACTACGAATAGGAGTTGTCTGATATCACACATTGTTGGCAGATTAT - 15000
-YQQLQLRIGVV*YHTLLADY 

- INNFNYE*ELSDITBCWQII 

STTSTTNRSCLISHIVGRL* 
15001 - AACGATAATAGTCATAATCACTGATAGCAGCGTTGCCATCCTGAGCAAAGAAGAAGTGTT - 15060
-NDNSHNH* *QRCHPEQRRSV 
-TIIVIITDSSVAILSKEEVF 
R**S*SLIAALPS*AKKKCF 
15061 - TTAGTTCAACAGAACTTCCTTCCTTAAAGAAACCTTTAGACACAGCAAAGTCATAAAAGT - 15120
-LVQQNFLP * R N L * TQQSHKS 
-*FNRTSFLKETFRHSKVIKV 
SSTELPSLKKPLDTAKS*KS 
15121 - CTTTATTAAAATTACCGGGTTTGACAGTTTGAAAAGCAACATTGTTTGTTAGTGCAGCTA - 15180

- L Y * N Y R V * QFEKQHCLLVQL 

- FIKITGFDSLKSNIVC*CSY 

LLKLPGLTV* KATLFVSAAT 
15181 - CTGAAAAGCATGTAGTGCGTTTATCTAGCAATAAATTGCCAGAAGCTGCATGCATAGCTG - 15240 
-LKSM*CVYLAINCQKL>HA*L 

- 1 K A C S A F I * Q * I A R S C M H S i'J 

EKHVVRLSSNKLPEAACIAG 
15241 - GATCAGCAGCATACACTAAAAGTTCCTTGAAACTGAGACGCGAGCTATGTAAGTTTACAT - 15300 
-DQQHTLKVP*N*DASYVSLH 

- ISSIH i KFLETETRAM*VYI 

SAAYTKSSLKLRRELCKFTS 
15301 - CCTGATTATGTACGACTCCTAACTCACGAAAATGGTATCCAGTTGAAACAACAAAAGGAA - 15360
-PDYVRLLTHENGIQLKQQKE 
-LIMYDS*LTKMVSS*NNKRN 
*LCTTPNSRKWYPVETTKGT 
15361 - CACCATCTACAAATATTTTTCTTACTAGTGGTCCAAAACTTGTAGGTGGAAACACAGTAG - 15420 
-HHLQIFFLLVVQNL*VETQ* 
-TIYKYFSY^KSKTCRWKHSR 
PSTNIFLTSGPKLVGGNTVE 
15421 - AAAATAACACATTAAAGTTTGCACAATGAAGGATACACCTATCATCCAAACAGTTAATAC - 15480 
-KITK*SLHNEGYTYHPNS*Y 
-K*HIKVCTMKDTPIIQTVNT 
NNTLKFAQ*RIHLSSKQLIQ 
15481 - AATTGGGATGGTATGTCTGGTCCCAATATTTAAAATAACGGTCGAAGAGACAAAGTCTCT - 15540 
-NW DGMSGPNI *NNGRRDKVS 

- IGMVCLVPIFKITVEETKSL 

- LGWYVWSQYLK*RSKRQSLS 

15541 - CTTCCGTAAAATCATATTTCAGCAAATCCCACTTAATAAGTGGTTTTGCGAGATCAGCAT - 15600 
-LP *NHISANPT* *VVLRDQH 
-FRKIIFQQIPLNKWFCEISI 
SVKSYFSKSHLISGFARSAS 

15601 - CCATATGGGACTCAGCAGCCAATGCCCTAGTCAAAGTGAGGATGGGCATCAGCAATGAGT - 15660
-PYGTQQPMP*SK*GWASAME 
-HMGLSSQCPSQSEDGHQQ*V 
IWDSAANALVKVRMGISNE* 

15661 - AATATGAATCCACAATAGGAACTCCGCAGCCTGGTGCTACTTGTACGAAATCACCGAAAT - 15720 
-NMNPQ*ELRSLVLLVRNHRN 

- I * IHNRNSAAWCYLYEITEI 

YESTIGTPQPGATCTKSPKS 
15721 - CGTACCAGTTCCCATTAAGATCCTGATTATCTAATGTCAGTACGCCTACAATGCCTGCAT - 15780
-RTSSH*DPDYLMSVRLQCLH 
-VPVPIKILII*CQYAYNACI 
YQFPLRS *LSNVSTPTMPAS 
15781 - CACGCATAGCATCGCAGAATTGTACAGTCTTTAATAATGATTGGCGTACACGCTCACCTA - 15840
-HA* HRRIVQSLIMIGVHAHL 
-THSIAEIiYSL* **LAYTLT* 
RIASQNCTVFNNDWRTRSPK 
15841 - AGTTAGCATATACGCGTAAGATGTCAGGATTCTCTACGAAGTCATACCAATCCTTCTTAT - 15900 
-S*HIRVRCQDSLRSHTNPSY 
-VSIYA*DVRILYEVIPILLI 
LAYTRKMSGFSTKSYQSFLL 
15901 - TGAAATAATCATCATCACAGCAATTGTATGTGACGAGTATTTCTTTTAATGTATCACAAT - 15960 
-* NNHHHSNCM*RVFLLMYHN 
-EIIIITAIVCDEYFF*CITI 



FIG. 12 Con't 



WO 2004/085650 



PCT/CN2004/000246 



74/106 



15961 - TACCCTCATCAAAATGACGTAGAGCATAGACTAAATCAGCCATTGTGTATTTAGTTAGAC - 16020
-YPHQNDVEHRLNQPLCI*LD 
-TLIKMT*SID*ISHCVFS*T 
?S3K*RRA*TKSAIVYLVRR 

16021 - GCTGACGTGATATATGTGGTACCATGTCACCATCTACTCTAAACTTGAAAAAGTCATGGA - 16080

- P. D V I Y V V P C H H Ii Ii * T * K S H G 
-LT*YMWYHVTIYSKLEKVMD 

L K K S W T 

:ttcatgttgg^ 

-qqpldnl*psyk*slhvgs* 
-snrwtifnqvinslfmlvvr 
atagqsltkl*ivsscw*ld 

16141 - ACATAGTATGCCTCTTAACTACAAAGTAAGAGTCTAATAAATTGCCTTCCTCATCCTTCT - 16200 
-T*YAS*LQSKSLINCLPHPS 
-HSMPLNYKVRV**IAFIiILIi 
IVCLLTTK 4 ESNKLPSSSFS 
16201 - CCTGGAAGCGACAGCAATTAGTTTTTAGGAACTTTGCAAAACCAGCACTTTTTTCGTTGT - 16260 
-PGSDSN*FLGTLQNQHFFRC 
-LEATAISF*ELCKTSTFFVV 
WKRQQLVFRNFAKPALFSL* 
16261 - AAATATCAAAAGCCCTGTAGACGACATCAGTACTAGTGCCTGTGCCGCACGGTGTAAGAC - 16320 
-KYQKPCRRHQY*CLCRTV*D 
-NIKSPVDDISTSACAARCKT 
ISKAL*TTSVLVPVPHGVRR 
16321 - GGGCTGCACTTACACCGCAAACCCGTTTAAAAACGTTGATGCATCCGCAGACTGCATCAA - 16380 
-GLHLHRKPV* KR* CIRRLHQ 
-GCTYTANPFKNVDASADCIK 
AALTPQTRLKTLMHPQTASR 
16381 - GGGTTCGCGGAGTTGGTCACAACTACAGCCATAACCTTTCCACATTCCGCAGACGGTACA - 16440 
-GFAELVTTTAITFPHSADGT 
-GSRSWSQLQP-PFHIPQTVQ 
VRGVGHNYSHNLSTFRRRYR 
16441 - GACTGTGTTTCTAAGTGTAAAACCCACTGGGTCATTAGCACAAGTGGTAGGTATTTGGAC - 16500 
-DCVSKCKTHWVISTSGRYLD 
-TVFLSVKPTGSLAQVVGIWT 
LCF*V*NPLGH*HKW 1 *VFGR 
16501 - GTACTTACCTTTCAAGTCACAGAATCCTTTAGGATTTGGATGGTCAATGTGGCATCTACA - 16560 
-VLTFQVTESFRIWMVNVAST 

- YLPFKSQNPLGFGWSMWHLQ 

TYLSSH'RIL^DLDGQCGIYN 
16561 - ATACAGACAACATGAAGCACCACCAAAGGACTCTTGGTCCATGTTAGCTTCTGGTGTTAC - 16620 
-I QTT * STTKGLLVHVS FWCY 
-YRQHEAPPKDSWSM1ASGVT 
TDNMKHHQRTLGPC*LLVLQ 
16621 - AGTAATTGCCTGTCCTGTACCAGTGTGTGTACACAACATCTTCACACAGTTGGTGATTGG - 16680 
-SNCLSCTSVCTQHLHTVGDW 
-VIACPVPVCVHNIFTQLVIG 
*LPVLYQCVYTTSSHSW*LV 
16681 - TTGTCCTCCACTTGCTAGGTAATCCTTATATGCTTTAGCAGGGTCTACTGCAAAAGCACA - 16740 
-LSSTC*VILICFSRVYCKST 
-CPPLAR*SLYALAGSTAKAQ 
VLHLLGNPYML*QGLIiQKHR 
16741 - GAAGGAAAGCACAGTTGAATTGGCAGGTACTTCTGTAGCATTTCCAGCCTGAAGACGTAC - 16800 
-EGKHS*1GRYFCSISSI)KTY 
-KESTVELAGTSVAFPA*RRT 
16801 - TGTAGCAGCTAAACTGCCCAGCACCATACCTCTATTTAGGTTGTTTAAGCCTTTGATGAA - 16860
-CSS*TAQHHTSI*VV*AFDE 
-VAAKLPSTIPLFRLFKPLMK 

* Q L N C P A P Y L Y L G C I, s L * - 3 

16861 - GTACAAGTATTTCACTTTAGGCCCTTTTGGTGTGTCTGTAACAAACCTACAAGGTGGTTC - 16920
-VQVFHFRPFWCVCNKPTRWF 
-YKYFTLGPFGVSVTNLQGGS 
TSISL*AIiI»VCIi*QTYKVVP 

16921 - CAGTTCTGTGTAAATTGTACCTGTACCATCACTCTTAGGGAATCTAGCCCATTTGAGATC - 16980 
-QFCVNCTCTITLRESSPFEI 

- S S V * I VPVPSLLGNLAHLRS 

VLCKLYLYHHS' i GI*PI*DL 
16981 - TTGGTGGTCTGATAGTAATGCCAGCACAAACCTACCTCCCTTCGAATTGTTATAGTAGGC - 17040 

- L V V * * *CQHKPTSLRIVIVG 
-WWSDSNASTNLPPFELL*' I 'A 

GGLIVMPAQTYLPSNCYSRQ 
17041 - AAGTGCATTGTCATCAGTACAAGCTGTTTGTGTGGTACCAGCCGCACAGGACATCTGTCG - 17100
-KCIVI STSCLCGTSRTGHLS 

- SALSSVQAVCVVPAAQDICR 

VHCHQYKLFVWYQPHRTSVV 
17101 - TAGTGCTACTGGACTCAGTTCATTATTCTGTAGTTTAACAGCTGAGTTGGCTCTTAGAGC - 17160
-*CYWTQFIIL*FNS*VGS*S 
-SATGLSSLFCSLTAELALRA 
VLLDSVHYSVV*QLSWLLEL 
17161 - TGTAACAATAAGAGGCCAAGCCAAATTTGGTGAATTGTCCATGTTAATTTCACTAAGTTG - 17220
-CNNKRPSQIW*IVHVNFTKL 
-VTIRGQAKFGEL3MLI 3 L S * 
*Q*EAKPNLVNCPC* F H * V E 
17221 - AACAATCTTGCTATCCGCATCAACAACTTGCTGGATTTCCCAGAGTGCAGATGCATATGT - 17280 
-NNLAIRINNLLDFPECRCIC 
-TILLSASTTCWISQSADAYV 
QSCYPHQQLAGFPRVQMHM* 
17281 - AAAGGTGTTACCATCACAAGTGTTCTTGTAGGTACCATAATCAGGGACAACAACCATGAG - 17340 
-KGVTITSVLVGTIIRDNNHE 
-KVLPSQVFL*VP*SGTTTMS 
RCYHHKCSCRYHNQGQQP*V 
17341 - TTTGGCTGCTGTAGTCAATGGTATGATGTTGAGTGGAACACAACCATCACGCGCATTGTT - 17400
-FGCCSQWYDVEWNTTITRIV 
-LAAVVNGMMLSGTQPSRALL 
WLL*SMV*C*VEHNHHAHC* 
17401 - GATAATGTTGTTAAGTGCATCATTATCAAGCTTCCTAAGCATAGTGAAGAGCATTGTTTG - 17460 
-DNVVKCI IIKLPKHSEEHCL 
-IMLLSASLSSFLSIVKSIVC 
*CC*VHHYQAS*A**RAIiFA 
17461 - CATAGCACTAGTTACTTTTGCCCTCTTGTCCTCAGATCTTGCCTGTTTGTACATTTGGGT - 17520 
-HSTSYFCPLVLRSCLFVHLG 

- IALVT FALLSSDLACLYIWV 

*H*LLLPSCPQILPVCTFGS 
17521 - CATAGCCTGATCTGCCATCTTTTCCAACTTGCGTTGCATGGCAGCATCACGGTCAAACTC - 17580 
-HSLICHLFQLALHGSITVKL 

- IA*SAIFSNLRCMAASRSNS 

* PDLPSFPTCVAWQHHGQTQ 

17581 - AGATTTAGCCACATTCAAAGATTTCTTTAACTTTTTGAGAACGACTTCAGAATCACCATT - 17640 
-RFSHIQRFL*LFENDFRITI 
-DLATFKDFFNFLRTTSESPL 
I *PHSKISLTF*ERLQNHH* 
17641 - AGCTACAGCCTGCTCATAGGCCTCCTGGGCAGTGGCATAAGCGGCATATGATGGTAAAGA - 17700
-SYSLLIGLLGSGISGI* W*R 
-ATACS*ASWAVA*AAYDGKE 

- LQPAHRPPGQWHKRHMMVKN 

17701 - ACTAAATTCTGAAGCAATAGCCTGAAGAGTAGCACGGTTATCGAGCATTTCCTCGCACAA - 17760

- T K F * 3NS LKSSTV IEHFLAQ 
-LNSEAIA*RVARLSSISSHN 

■ ILKQ* PEE * HGYRAFPR TT 
17761 - CCTATTAATGTCTACAGCACCCTGCATGGATAGCAAAACAGACAAAAGAGAAACCATCTT - 17820 
-PINVYSTLHG*QNRQKRNHL 
-LLMSTAPCMDSKTDKRETIF 
Y*CLQHPAWJAKQTKEKPS3 
17821 - CTCGAAAGCTTCAGTTGTGTCTTTTGCAAGAAGAATATCATTGTGGAGTTGTACACATTG - 17880
-LESFSCVFCKKNIIVEL.YTL 
-SKASVVSFARRISLWSCTHC 
RKLQLCLLQEEYHCC-VVHIV 
17881 - TGCCCACAATTTAGAAGATGACTCTACTCTAAGTTGTTGAAGAACCGAGAGCAGTACCAC - 17940 
-CPQFRR* LYSKLLKMREQYH 
-AHNLEDDSTLSC*RTESSTT 
PTI i KMTLL*VVEEPRAVPQ 



-DVHFTSDILDCTVATLIHGL 
MCTLRQTF*TVQ*QP*YMVY 
18001 - ACCTCCAATACCCAACAACTTAATGTTAAGCTTGAAAGCATCAATACTACTCTTAGGAGG - 18060
-TSNTQQLNVKIjESINTTLRR 

- PPIPNNLMLSLKASILLLGG 

LQYPTT*C*A*KHQYYS*EA 
18061 - CAAAAGCCCCTGGGAGTTCATATACCTAAATTCTTGTGTAGAGACCAAGTAGTCATAAAC - 18120
-QKPLGVH I PKFLCRDQVVIN 
-KSP WEFIYLNSCVETK*S*T 
KAPGSSYT*ILV + RPSSHKH 
18121 - ACCAAGAGTAAGCCTGAAGTAACGGTTGAGTAAACAGAAAAGGCCAAAGTAGCAGCAGCA - 18180
-TKSKPEVTVE*IEKAKVAAA 
-PRVSLK*RLSKQKRPK*QQQ 
] G * V N R K G 
lAACAAGCATGATACACTGTAAC 
-TIA*ETINKHDTL*GVASNK 
-Q*PKKQ*TSMIHCKVLPVIN 

- NSLRNNKQA*YTVRCCQ* * I 

18241 - TAACAATGGGTAATACTCAACACACACAAACACTATAGCTCTAGCTAAAAACATGATAGT - 18300 

- * QWV ILNTHKHYSSS * KHDS 
-NNG* YSTHTNTIALAKNMIV 

TMGNTQHTQTL i L*LKT** S 
18301 - CGTAACGACACCAGAATAGTTAGAGGTTACAGAAATAACTAAGGCCCACATGGAAATAGC - 18360 
-RNDTRIVRGYRNN*GPHGNS 
-VTTPE*LEVTEITKAHMEIA 
*RHQNS*RLQK*LRPTWK*L 
18361 - TTGATCTAAAGCATTACCATAGTAGACTTTGTAAACAAGTGTAATGACATTCATCAGTGT - 18420 
-LI*SITIVDFVNKCNDIHQC 
-*SKALP* *TL*TSVMTFISV 
K H Y H S R 
iCGTCTAGCAGCATCATCAl 
PNTSSSII INSASCHENKQN 
QTRLAASS*TVRAVMRISKT 
KHV*QHHHKQCELS*E*AK1 
18481 - TAAAGCTGAAGCATACATAACACAATCCTTAAGCCTATAACCAGACAAGCTAGTGTCAGC - 18540
-*S*SIHNTILKPITRQASVS 
-KAEAYITQSLSL*PDKLVSA 
KLKHT*HNP*AYNQTS*CQP 

18541 - CAATTCAAGCCATGTCATGATACGCATCACCCAGCTAGCAGGCATGTAGACCATATTAAA - 18600 
-QFKPCHDTHHPASRHVDHIK 
-NSSHVMIRITQLAGM 'TILK 
IQAMS*YASPS*QACR?Y*S 

18601 - GTAAGCAACTGTTGCAAGAGAAGGTAACAGAAACAAGCACAAGAATGCGTGCTTATGCTT - 18660
-VSNCCKRR*QKQAQECVLML 

- *ATVAREGNRNKHKNACLCL 

KQLLQEKVTETSTRMRAYA* 
18661 - AACAAGCAGCATAGCACATGCAGCAATTGCCATAATACCAAGAGTAAATGGCAAGAAAGC - 18720 
-NKQHSTCSNCHNTKSKWQES 
-TSSIAKAAIAIIPRVNGKKA 
QAA*HMQQLP*YQE*MARKH 
18721 - ATTCTCGTAAACAAAGAAAAACAGTGACCACTGTGTACTTTGAACAAGAATCAATAGTGA - 18780 
-IJjVNKEKQ*PLCTIiNKNQ* * 

- FS*TKKNSDHCVL- l TRINSD 

SRKQRKTVTTVYFEQESIVM 
18781 - TGTCAAGAAAGTTAAAAGCATCCAATGATGAGTGCCCTTAACAATTTTCTTGAACTTACC - 18840
-CQES*KHPMMSALNNFLELT 
-VKKVKSIQ* "VPLTIFLNLP 
SRKLKASNDECP*QFS*TYL 
18841 - TTGGAAGGTAACACCAGAGCATTGTCTAACAACATCAAATGGTGTAAACTCATCTTCTAA - 18900 
-LEGNTRALSNHIKWCKLIF* 
-WKVTPEHCLTTSNGVNSSSK 
GR*HQSIV*QHQMV*THLLK 
18901 - AATAGTGCTACCAAGGATAGTACGACCATTCATACCATTCTGCAGCAGCTCTTTCAAAGC - 18960
-NSATKD3TTIHTILQQLFQS 

- IVLPRIVRPFIPFCSSSFKA 

*CYQG*YDHSYHSAAAIiSKQ 
18961 - AGCACACATATCTAAGACGGCAATTCCTGTTTGAGCAGAAAGAGGTCCCAATATGTCAAC - 19020
-STHI^DGNSCLSRKRSQYVN 
-AHISKTAIPV*AERGPNMST 
HTYLRRQFLFEQKEVPICQH 
19021 - ATGATCTTGTGTCAAAGGTTCATAGTTGTACTTCATTGCCACAAGGTTAAAGTCATTCAA - 19080
-MILCQRFIVVLHCHKVKVIQ 

- ^SCVKGS^LYFIATRLKSFK 

DLVSKVHSCTSLPQG*SHSK 
19081 - AGTAGTGGTGAATCTATTAAGAAACCACCTATCACCATTGATAACAGCAGCATACAGCCA - 19140
-SSGESIKKPPITIDNSSIQP 
-VVVNLLRNHLSPLITAAYSH 
*W*IY'*ETTYHH**QQHTAM 
19141 - TGCCAAAACATTTAATGTTATGGTTGTGTCTGTACCTGCAGCCTGTGCAGTTTGTCTGTC - 19200
-CQNI*CYGCVCTCSLCSLSV 
-AKTFNVMVVSVPAACAVCLS 
PKHLMLWLCLYLQPVQFVCQ 
19201 - AACAAATGGACCATAGAATTTACCTTCTAAGTCAGTACCAGCGTGTACTCCTGTTGGAAG - 19260 
-NKWTIEFTF*VSTSVYSCWK 
-TNGP*NLPSKSVPACTPVGS 
QMDHRIYIiLSQYQRVLLLEA 
19261 - CTCCATATGATGCATATAGCAGAAAGACACGCAATCATAATCAATGTTAAAACCAACACT - 19320
-LHMMHI AERHAI I INVKTNT 
-SI*CI*QKDTQS*SMLKPTL 
PYDAYSRKTRNHNQC*NQHY 
19321 - ACCACATGATCCATTAAGGAAAGAACCTTTAATGGTATGATTAGGTCTCATGGCACACTG - 19380
-TT*SIKERTFNGMIRSHGTL 
-PHDPLRKEPLMV*LGLMAH* 
HMIH*GKNIi*WYD*VSWHTD 
19381 - ATAAACACCAGATGGTGAACCATTGTAGCATGCTAGAACTGAAAATGTTTGACCAGGTTG - 19440
-INTRW*TIVAC*N*KCLTRL 
-* TPDGEPL*HARTENV*PGW 
KHQMVNHCSMLELKMFDQVG 
19441 - GATACGGACAAATTTATACTTGGGTGTCTTAGGGTTAGAAGTATCAACTTTAAGCCTAAG - 19500 
-DTDKFIL3C LRVRS INFKPK 
-IRTNLYLGVLGIiEVSTLSLS 
Y G Q I Y T W V S * G * K Y Q Ii * A •'■ A 
19501 - CAGACAATTTTGCATAGAATGGCCAATAACACGAAGTTGAACATTGCCAGCCTGAACAAG - 19560 
-QTILHRMANNTKLN IASLNK 
-RQFCIEWPITRS*TIiPA*TR 
DNFA*NGQ*HEVEHCQPEQE 
19561 - AAAGCTATGGTTGGATTTGCGAATGAGCAGATCTTCATAGTTAGGATTAAGCATGTCTTC - 19620 
-KAMVGFANEQIFIVRIKHVF 
-KLWLDLRMSRSS*LGLSMSS 
SYGWICE*ADLHS* D * A C L L 
19621 - TGCTGTGCAAATGACATGTCTTGGACAGTATACTGTGTCATCCAACCACAATCCATTAAG - 19680 
-CCANDMSWTVYCVIQPQSIK 
-AVQMTCLGQYTVSSNHNPLR 
LCK*HVLDSILCHPTTIH*E 
19681 - AGTTGTAGTTCCACAGGTTACTTGTACCATGCACCCTTCAACTTTGCCTGACGGGAATGC - 19740 
-SCSSTGYLYHAPFNFA^REC 
-VVVPQVTCTMHPSTLPDGNA 
L * FHRLLVPCTLQLCITGMP 
19741 - CATTTTCCTAAAACCACTCTGCAGAACAGCAGAAGTGATTGATGTCTGTGGTGGTTGGTA - 19800 
-HFFKTTLQNSRSD*CLWWLV 

- IFLKPLCRTAEVIDVCGGW* 

FS *NHSAEQQK* LMSVVVGR 
19801 - GAGAACATCAGCACCTGAGTTGCTAAAGTCATTTAGAGCCTTTGCTAAGTGGCAGCAAGC - 19860 
-ENIST*VAKVI*SLC*VAAS 
-RTSAPELLKSFRAFAKWQQA 
EHQHLSC*SHLEPLLSGSKL 
19861 - TGCTTCACGATAGCTGGTAGTATCTAAGGCTCCACTGAAATACTTGTACTTGTTATATAG - 19920 
-CFTIAGSI*GSrEILVLVI* 
-ASR^LVVSKAPLKYLYLLYR 
LHDSW*YLRLH*NTCTCYIE 
19921 - AGCAAGATACCTGTTATACTGTGTAAGTGGCAACAGTGTCTCGCTACGCAATTTTAGGTA - 19980 
-SKIPVILCKWQQCLATQ'F*V 
-ARYL1YCVSGNSVSLRNFRY 
QDTCYTV*VATVSRYAILGT 
19981 - CATTTCCTTGTTGAGCAAAAAGGTACACAAAGCAGCCTCCTCGAAGGTACTAAATGTAAC - 20040 
-HFLVEQKGTQSSLLEGTKCN 

- ISLLSKKVHKAASSKVLNVT 

FPC*AKRYTKQPPRRY*M*L 
20041 - TCCATTAAACATGACTCTTTTCCTAAGATAGTTGTTAAAGAACCAATGGCAGTGCTTCAG - 20100 

- S I KH DS FP KIVVKE PMAVLQ 
-PLNMTLFLR*LLKNQWQCFR 

H*T*LFS*DSC*RTNGSASE 
20101 - AGAAATACAGAATACATAGATTGCTGTTATCCAAAAAGGCACAATAGGAGAAAACATGGC - 20160 
-RNTEYIDCCYPKRHNRRKHG 
-EIQNT^IAVIQKGTIGENMA 

KYRIHRLLLSKKAQ*EK TKQ 
20161 - AAACCATTGAAGGTGAGCCAAGAATGAAACATCATTGGTGAAATAGAATGTCAAGTACAA - 20220
-KPLKVSQE*NIIGEIEC QVQ 
-NH*R*AKNETSIiVK*NVKYK 
TIEGEPRMKHHW*NRMSSTS 
20221 - GTAAAAGACTGAGTAGACTCCCGGCAGAAAGCTGTAAGCTGGTACCAGACAGAGTATAGT - 20280
-VKD*VDSRQKAVSWYQTEYS 
~*KTE*TPGRKL*AGTRQSIV 
KRLSRLPAESCKLVPDRV** 
20281 - GAAAGACATCAAAAACAAAAGTGCATTAGCAGCAACAACATGGTTGTACTCACCAAAAAC - 20340 
-ERHQKQKCISSNNMVVLTKN 
-KDIKNKSALAATTWLYSPKT 
KTSKTKVH*QQQHGCTHQKH 
20341 - ACGTCTGAATTTCATAAAGTAGTAGGCAGCACAAGTCACCAATATGGCAATAATACCACC - 20400
-TSEFHKVVGSTSHQYGNNTT 
-RLNFIK**AAQVTNMAIIPP 
V*IS*SSRQHKSPIWQ*YHQ 
20401 - AGCCACTACTGAAGCAGACACATCTAAAGCACCCACAGGTTGCACAAGAGGAGTAAAGAT - 20460
-SHY^SRHI^STHRLHKRSKD 
-ATTEADTSKAPTGCTRGVKM 
PLLKQTHLKHPQVAQEE*RC 
20461 - GTTAGCTATGAGATTCATCGCATCAACACCACAGAAAACTCCTGATAGAGCTCTGTAATG - 20520 
-VSYEIHR1NTTENS-**SSVM 
-LAMRFIASTPQKTPDRAL*C 
*L*DSSHQHHRKLLIELCNA 
20521 - CTCATTATTAAGAACCCATCTACCACTGGTAGATAGGCAAATACCTACTTCTGACCTTTC - 20580
-LIIKNPSTTGR*ANTYF*PF 
-SLLRTHLPLVDRQIPTSDLS 
HY*EPIYHW*IGKYLLLTFR 
20581 - GCATGTACCATGTCTACAGTACTCAGCATCAAAAGTTGTTACTACTCTAACAGAACCCTC - 20640 
-ACTMSTVLSIKSCYYSNRTL 
-HVPCLQYSASKVVTTLTEPS 
- MYHVYSTQHQKLLLL*QNPP 
ICTGTATGATGGAACC! 
L Y D G T : 
C M M E P 



-TKLTIRMRTL*QISVITIWH 
■RSSL*EIEPSSKLVS*QYGT 



-RFAHS ILKNCTLSSKNASRG 
-GLPIASLKIVHSAARTQAEV 

V C P * HP*KLY?QQQERKQR* 
20821 - AGCAAAATCACTATACTCAATGAGTTTGGAAGGTGTGTAGCAAATGTTGCCAACAGCACT - 20880 
-SKITILNEKGRCVANVANST 
-AKSLYSMSLEGV*QMLPTAL 

QNHYTQ*VWKVCSKCCQQH* 
20881 - AAAAACACGAGGTAGAAAATGCAAGAAGTCACCATTGATTGCTCTCAGCACAGTACCCGG - 20940 
-KNTR*KMQEVTIDCSQHSTR 
-KTRGRKCKKSPLIALSTVPG 

KHEVENARSHH* LLSAQYPV 
20941 - TAAGCCAGGCACTATGAAACCAATCTCTCTTGTAATGATAGCAGCTACTACAGGGCAGCT - 21000 
-*ARHYETNLSCNDSSYYRAA 
-KPGTMKPISLVMIAATTGQL 

SQAL*KQSLL 1t * *QLLQGSF 
21001 - TTTGTCATTTTTGTATGAACCACCACGCTGGCTAAACCATGCGTCAAAACCAGCATGTTT - 21060
-FVIFV*TTTLAKPCVKTSMF 
-LSFLYEPPRWLNHASKPACL 
CHFCMNHHAG* TMRQNQHVY 
21061 - ATTTGCAAAACAATCATCAGTAGAAATGATGTCACGAGTGACACCATCCTGAATGGCTTT - 21120
-1CKTIISR NDVTSDTILNGF 
-FAKQSSVEMMSRVTPS*MAIi 
LQNNHQ*K*CHE*HHPEWLC 
21121 - GTAACCAATGATTTCATTTGTGTAACCATCATGGATTGACAATGTATGTACTGGCATAAC - 21180 
-VTNDFICVTIMD*QCMYWHN 
-*PMISFV*PSWIDNVCTGIT 
NQ *FHLCNHHGLTMYVLA*R 
21181 - GATATAACAAACCAATGCAGCAAGAACGCACAATAATGTGGCCTTAAGCATAAGTTTAAA - 21240

- D I TNQCS K N A Q * CGLKHKFK 

- I^QTNAARTHNNVALSISLK 

YNKPMQQERTIMWP*A*V*N 
21241 - ACAAGTACTAACAATCTTACCACCCTTGAGTGAGATTTTAGTAGTTATGACATTGACAAC - 21300
-TSTNNLTTLE i 'DFSSYDIDN 
-QVLTILPPLSEILVVMTLTT 
KY*QSYHP*VRF J - i L*H i QP 
21301 - CTGTCTAGTTGTAGCACAAGTTAGTGTAAAAGGTATGTTGTTCTTCTTGGCAGCAGTACG - 21360 
-LSSCSTS*CKRYVVLLGSST 
-CLVVAQVSVKGMLFFLAAVR 
V A L*HKLV*KVCCSSWQQYE 
21361 - AATTTGTTTACGCAGCTGTTCAGATAAAGACATGTAGTCTTTTACATTCCAGATGAGTGA - 21420 
-NLFTQLFR*RHVVFYIPDE* 
-ICLRSCSDKDM*SFTFQMSE 
FVYAAVQIKTCSLLHSR*VK 
21421 - AACATTGTGACTTTTTGCTACTTGGGCATTGATATGCCTTGCATTACAGTCAATACATGC - 21480 
-NIVTFCYLGIDMPCITVNTC 
-TL*LFATNALICLALQSIHA 
HCDFLLLGH*YALHYSQYMR 
21481 - GCCAAGATCTCTGGGCGTCATGTTTTCAACCTTATTATAGGTGAGCATGAAATTGTTACA - 21540 
-AKISGRHVFNLIIGEHEIVT 
-PRSLGVMFSrLL*VSMKLLQ 
QDLWASCFQPYYR*A*NCYN 
21541 - ACTGTCACCTGTCACTTCTAAGTCAGAGTGATGTGAAAGTTTGAGACATTCAATAACATC - 21600 
-TVTCHF*VRVM*KFETFNNI 
-LSPVTSKSE*CESLRHSITS 
CHLSLLSQSDVKV*DIQ*HP 
21601 - CTTTGTGTCAACATCGGTATCAACAACACCTTGTCGGGCAGCTGACACGAATGTAGAAAG - 21660
-LCVNIGINNTLSGS*HECRK 
-FVSTSVSTTPCRAADTNVER 
LCQHRYQQHLVGQLTRM*KG 
21661 - GACACCATCTAAAGCTACACCCTTTGCTAACTCGCTGTGAGCTGTAGCAACAAGTGCCTT - 21720 
-Dri*SYTLC*LAVSCSNKCL 
-TPSKATPFANSL*AVATSAL 
HHLKLHPLLTRCEL*QQVP* 
21721 - AAGTTTTTCCATAGGAACACTAAAAGTTGCTGAAAAGGTGTCGACATAAGCATCAAACAT - 21780 
-KFFHRNTKSC*KGVD1SIKH 
-SFSIGTLKVAEKVST*ASNI 
' E H * K 

:ttcagtactatct 

~lngnfstisnv*ykslvkqq 
-ltetsvlsptfdtrawssnr 
* rklqyylqrliqelgqate 
21841 - AATAGGTTGGCACATCAGCTGACTGTAGTACACAGAAGCAGACTTAGAAGCAGACTCGTC - 21900
-NRLAHQLTVVHRSRLRSRLV 
-IGWHIS*L*YTEADLEADSS 
*VGTSADCSTQKQT*KQTRR 

21901 - GCATTTGGACTTGCCATCAAAAACTATGACATTAATAGGCAGTGAACCTTTAGTGTTGTT - 21960 

- A F G L A I KNYDINRQ* T F S V V 
-HLDLPSKTMTLIGSEPLVLL 

IWTCHQKL*H**AVNL*CC* 
:taaattg; 
* I D 

-ALKLSKLTKWESGCLS*VF* 
ISNCLN*QNGRADVSHRSFD 
22021 - ACCAGCCTTGTCAAAGTAGAGGTGAAGCGCGCCATTTTTCACAGCAACACTATCAACAAT - 22080 
-TSLVKVEVKRAIFHSNTINN 
-PALSK*R*SAPFFTATLSTI 
QPCQSRGEARHFSQQHYQQY 
22081 - ATACGATGACTGGTCAGTAGGGTTGATTGGTCTTTTAAACTGGAGTGACAAATCACGAGC - 22140
-IR*LVSRVDWSFKLE*QITS 
-YDDWSVGLIGLLNWSDKSRA 
TMTGQ*G*LVF*TGVTNHEQ 
22141 - AACTTCATCACTAATGAATGTACTACCAGTGCAAAATGTGTCACAATTGAGACAATTCCA - 22200 
-NFITNECTTSAKCVTIETIP 
-TSSLMNVLPVQNVSQLRQFQ 
I> H H * *MYYQCKMCHN* DNSN 
22201 - ATTGTGAGTCTTGCAGAAGCCACGGCCTCCATTTGCATAGACATAGAAAGATCTCTTCAT - 22260 
-IVSLAEATASrCIDIERSLH 
-L*VLQKPRPPFA*T*KDLFM 
CE5CRSHGLHLHRHRKISSC 
22261 - GCCATTAACAATAGTTGTACACTCAACGCGTGTGGCACGATTGCGCTTATAGCACATCAT - 22320 
-AINNSCTLNACGTIALIAHH 
-PLTIVVKSTRVARLRL*HIM 
H * Q * LYTQRVWHDCAYSTSC 
22321 - GCAAGTCGAAGAGGTGCAACCATCCATGATATGAACATAGCTCTTCCATATGTAGTAGAA - 22380 
-ASRRGATIHDMNIALPYVVE 
-QVEEVQPSMI*T*LFHM**K 

- KSKRCNHP*YEHSSSICSRK 

22381 - AGAAGCAAAGAAGATGTACATCCTAACCATTGCAGAAACGGGTGCCATTTGTACAATACT - 22440
-RSKEDVHPNHCRNGCHLYNT 
-EAKKMYILTIAETGAICTIL 
KQRRCTS *PLQKRVPFVQY* 
22441 - AATGATAAACCACATGAGCCAAGAATTGCTGATGAAATGACTAGCAAAATAGCCAAAGAA - 22500 
-NDKPHEPRIADEMTSKIAKE 
-MINHMSQEIjIiMK*LAK j 'PKN 
**TT*AKNC**ND*QNSQRT 
22501 - CACCTGCATTATAGCTGAAAGACCTAATAAATAAAAGAATTTTGTGAACAACATATATGC - 22560 
-HLHYS*KT* *IKEFCEQHIC 
-TCIIAERPNK*KNFVNNIYA 
PAL*LKDLINKRIL*TTYMP 
22561 - CAAAACCCACTCAGCGGCCAGACCTAAAATTGTCAAGTCTAGCTTGTACGATGAAATCGT - 22620
-QNPLSGQT*NCQV*LVR*NR 
-KTHSAARPKIVKSSLYDEIV 
KPTQRPDLKLSSLACTMKSS 
22621 - CACCTGAATGGTTTCAAGAGCTGGATAAGAATCAAGGGAGTCTAATCCACTTAAACAAAT - 22680

- H L N G FKSW IRIKGV * S T * T N 
-T*MVSRAG*ESRESNPLKQM 

PEWFQELDKNQGSLIHLNKC 
-L QGKEPSQKSIVVTLDELRY
CKEKNLHRNP***R*TN*DT 
22741 - CAATTCTCTAACGCCATTACAATAAGAAGGAGCACCAAAATTAGATAAGAGTACACCAAA - 22800
-QFSNAITIRRSTKIR*EYTK 
-NSLTPLQ*EGAPKLDKSTPK 
I L * R K Y N K K E H Q N •• I R V II Q K 
22801 - AGCAGCAGTTACACAGATTAGAGAACCTAAGCAAATACTTAACAACAATAGCCACATAGC - 22860
-SSSYTD*RT*ANT*QQ* PHS 
-AAVTQIREPKQILNNNSHIA 
QQLHRLENLSKYLTTIAT*R 
22861 - GATTGTGAACAATTTAGAAAATTTGGGTGACTTCACATAATTAATGCCGGCATCCAAACA - 22920 
-DCEQFRKFG* LHI I N A G I QT 

- IVNNLENLGDFT*LMPASKH 

L i TI*KIWVTSHN*CRHPNI 
22921 - TAATTTAGCAACACTCTTAACACTATTTTTAGCAATAGTTGTAGGTAGTGAAGCTCTAAT - 22980 

- * FSNTLNTIFSNSCR**SSN 
-NLATLLTLFLAIVVGSEALI 



IGTFSKSTQLEQ*CKHIR 

LVLLVKVHNWNNNVNT*G 
I W Y F * *KYTIGTIM*THKA 
:TTAGCGCAATTTGATGTTGTAAT r 
, S A I * C C N C 

trcanllaqfdvvi 
:hvvlis*rnlml*l 

raTGACATAAGCCAAAJ 
? D r S Q N 
-ACPKNGLT*AKILLQGTLLI 
LVLRMV*HKPKFYSKEHY*L 
23161 - TGCAGCAATACCATGAGTGGCAATTGTTTTTAAACCTAAGGCTAGTGAAAGCTCATTAGG - 23220
-CSNTMSGNCF*T*G* * K L I R 
-AAIP*VAIVFKPKASESSI>G 
QQYHEWQLFLNLRLVKAH*V 
23221 - TTTCTTAATGGTAATGCTTGTGTTTTCCACATAAGCAGCCATAAGATCCTCATGACCTAA - 23280
-FLNGNACVFHISSHKILMT* 
-FLMVMLVFST*AAIRSS' t PN 



-SCVTLTPSSDGLSMTliPTTS 
LVLL*HLHLMV*V*HCLQIiR 
23341 - GGTAGTTTTCACGTCACACTCTATGACTTCCTTCTGTATGGTAGGATTTTCCACTACTTC - 23400 
-GSFHVTLYDFLLYGRIFHYF 
-VVFTSHSMTSFCMVGFSTTS 
*FSRHTL*LPSVW*DFPLLL 
23401 - TTCAGAGGTGGGTTGTTGACTTTCACAAGCAAGATTGTCCATTCCTTGTGTGTCTTCTAC - 23460
-FRGGLLTFTSKIVHSLCVFY 
~SEVGC*LSQARLSIPCVSST 
QRWYVDFHKQDCPFLVCLLL 



ARTSNEFEVSTGFVLQRQRK 
PELQMNLKYLLALYSKDNYN 
23521 - ACACCAAGTGTTTGGTTTGAACGTTGTCTTGGTTGTAGGCTGGTTAATGTGCCAAACAAT - 23580
-TPSVWFBRCLGCSLVNVPNN 
-HQVFGLNVVLVVAHLMCQTI 
TKC1V*TLSWL*EG*CAKQL 
23581 - TGGCTTATGCAGTAATTTAGCACCTTTCTTGAAACTCGCTGAATAGTGTCTATAGTCAAT - 23640
-WLMQ*FSTFLETR i IVSIVN 
-GLCSHLAPFLKLA2*CL*SI 
ftYAVI*HLS*NSLNSVYSQ* 
23641 - AGCCACTACATCGCCATTCAAGTCTGGGAAGAATGTGACAGATAGCTCTCGTGAAGCTGG - 23700
-SHYIAIQVWEECDR*LS*SW 
-ATTSPFKSGKNVTDSSREAG 
PLHRHSSLGRM*QIALVKLA 
23701 - CTTTGTGAAGCCTGTCATTTGATTTAAATCATCAGCAAATTTTGTGTTAGAACATGTGAG - 23760 
-LCEACHLI * I I SKFCVRTCE 

- FVK PV1 * FKSSANFVLEHVS 

L* SLSFDLKHQQILC*NM*V 
23761 - TTTGAAATTATCAAAACTCGCATTTGGTAATGGTTGAGTTGGTACAAGGTCTATAGGCTG - 23820 
-FEIIKTRIW*WLSWYKVYRL 
-LKLSKLAFGNG*VGTRSIGC 
*NYQNSHLVMVELVQGL*AA 
23821 - CTCTGTATAGTAAGCATTATCCTTTTTATAATACCCATCCAATTTTGGTTCAATCTCTGT - 23880 
-LCIVSIILFIIPIQFWFNLC 
-SV**ALSFL*YPSNFGSISV 
LYSKHYPFYNTHPILVQSLC 
23881 - GTAAGTAACTCCATCGAGTTTATACGACACAGGCTTGATGGTTGTAGTGTAAGATGTTTC - 23940 
-VSNSIEFIRHRLDGCSVRCF 

- + VTPSSLYDTGLMVVV*DVS 

K*LHRVYTTQA*WL*CKMFP 
23941 - CTTGTAGAAAACATCAGTCACTGGTCCTTTGTACTCTGACATCTTTGTAAGGTGAGCTCC - 24000
-LVENISHWSFVL*HLCKVSS 
-L*KTSVTGPIiYSDIFVR*AP 
CRKHQSLVLCTLTSL*GELR 
24001 - GTCAATACGATAGAGGGTCTCCTTAGCAGTTATATGAGTGTAATGACCACACTGATAGTT - 24060
-VNTIEGLLSSYMSVMTTLIV 
-SIR*RVSLAVI*V**PH**L 
QYDRGSP*QLYECNDHTDSY 
24061 - ACCAGTGTACTCATTCGCACATAAGAATGTACCTTGCTGTAATTTATACTCAGCAGGTGG - 24120
-TSVLIRT*ECTLL J 'FILSRW 
-PVYSFAHKNVPCCNLYSAGG 
QCTHSHIRMYLAVIYTQQVV 
24121 - TGCAGACATCATAACAAAAGAAGACTCTTGTTGTACTAGATATTGTGTAGCATCACGACC - 24180 
-CRHHNKRRLLLY^ILCSITT 
-ADIITKEDSCCTRYCVASRP 
QTS*QKKTLVVLDIV*HHDH 
24181 - ACACACACATGGAATGGAAACACCTGTCTTAAGATTATCATAAGATAGAGTACCCATATA - 24240 
-THTWMGNTCIiKI I I R* STHI 
-HTHGMETPVLRLS* DRVPIY 
THMEWKHL5*DYHKIEYPYT 
24241 - CATCACAGCTTCTACACCCGTTAAGGTAGTAGTTTTCTGACCACAATGTTTACACACCAC - 24300 
-HHSFYTR*GSSFL1TMFTHH 
-ITASTPVKVVVF*PQCLHTT 
SQ1LHPLR* * FSDHNVYTPH 
24301 - ATTAAGAACTCGCTTTGCAGATTCCAAATTAGCATGCTGTAGAAGATGGGTCATAGTTTC - 24360 
-IKNSLCRFQISML*KMGHSF 
-LRTRFADSKLACCRRWVIVS 
*ELALQIPN*HAVEDGS*FL 
24361 - TCTGACATCACCAAGCTCGCCAACAGTTTTATTACTGTAAGCGAGTATGAGTGCACAAAA - 24420
-SDITKLANSFITVSEYECTK 
-LTSPSSPTVLLL*ASMSAQK 
*HHQARQQFYYCKRV*VHKS 
24421 - GTTAGCAGCATCACCAGCACGGGCTCTATAATAAGCCTCTTGAAGTGCTGGTGCATTGAA - 24480
-VSS ITSTGS1 ISLLKCWCIE 
L A A S P A R A L * * A S * SAGALN 
*QHHQHGIi5fNKPLEVLVH*I 
24481 - TTTGACTTCAAGCTGTTGAAGTGCTAATAAAACACTAGACAAATAACAATTGTTATCAGC - 24540 
-FDFKLLKC* *NTRQITIVIS 
-LTSSC*SANKTLDK*QLLSA 
*LQAVEVLIKH*TNNNCYQP 
24541 - CCATTTAATTGAAGTTAAACCACCAACTTGAGGAAATTTCCATTTCTTTGTGTGGTTTAA - 24600 
-PFN*S*TTNLRKFPFLCVV* 
-HLIEVKPPT*GNFHFFVWFK 
I*LKLNHQLEEISISLCGLK 
24601 - AGCAGACATGTACCTACCAAGAAAACTCTCATCAAGAGTATGGTAGTACTCGAAAGCTTC - 24660 
-SRHVPTKKTLIK3MVVLE5F 
-ADMYLPRKLSSRVW*YSKAS 
QTCTYQENSHQEYGSTRKLH 
24661 - ACTACGTAGTGTGTCATCACTAGGTAGTACAAAGAAAGTCTTACCCTCATGATTTACATG - 24720 
-TT* CVITR*YKESLTLMI YM 
-LRSVSSLGSTKKVLPS*FT* 
YVVCHH*VVQRKSYPRDLHE 
24721 - AGGTTTAATTTTTGTAACATCAGCACCATCCAAGTATGTTGGACCAAACTGCTGTCCATA - 24780
-RFNFCNISTIQVCWTKLLS I 
-GLIFVTSAPSKYVGPNCCPY 
V*FL*HQHHPSMLDQTAVHM 
24781 - TGTCATAGACATATCCACAAGCTGTGTGTGGAGATTAGTGTTGTCCACAGTTGTGAACAC - 24840 
-CHRHIHKLCVEISVVHSCEK 
-VI DISTSCVWRLVLSTVVNT 
5*TYPQAVCGD*CCPQL*TL 
24841 - TTTTATAGTCTTAACCTCCCGCAGGGATAAGAGACTCTTTAGTTTGTCAAGTGAAAGAAC - 24900
-FYSLNLPQG*ETL* F V K * K N 
- FIVLTSRRDKRLFSLSSERT 
S* PPAGIRDSLVCQVKEP 
24901 - CTCACCGTCAAGATGAAACTCGACGGGGCTCTCCAGAGTGTGGTACACAATTTTGTCACC - 24960
-LTVKMKLDGALQSVVHNFVT 
-SPSR*NSTGLSRVWYTILSP 
HRQDETRRGSPECGTQFCHH 
24961 - ACGCTTAAGAAATTCAACACCTAACTCTGTACGCTGTCCTGAATAGGACCAATCTCTGTA - 25020 
-TLKKFNT*LCTLS*IGPISV 
-RLRNSTPNSVRCPE*DQSL* 
A*EiQHLTIiYAVLNRTNLCK 
25021 - AGAGCCAGCCAAAGAAACTGTTTCTACAAAGTGCTCCTCAGATGTCTTTGATGACGAAGT - 25080
-RASQRNCFYKVLlRCIi**RS 
-EPAKETVSTKCSSDVFDDEV 
SQPKKLFLQSAPQMSLMTK* 
25081 - GAGGTATCCATTATATGTAGTAACAGCATCTGGTGATGATACTGACACTACGGCAGGAGC - 25140 
-EVSirCSNSIW**Y*HYGRS 
-RYPLYVVTASGDDTDTTAGA 
G I HYM* * Q H L V M I LTLRQEL 
25141 - TTTAAGAGAACGCATACAGCGCGCAGCCTCTTCAAGATTAAAACCATGTGTCACATAACC - 25200
-FKRTHTARSLFKIKTMCHIT 
-LRERIQRAASSRLKPCVT*P 
*ENAYSAQPLQD*NHVSHNQ 
25201 - AATTGGCATTGTGACAAGCGGCTCATTTAGAGAGTTCAGCTTCGTAATAATAGAAGCTAC - 25260
-NWHCDKRLI*RVQLRNNRSY 
-IGIVTSGSFREFSFVIIEAT 
L A L * QAAHLESSAS •"•'* * K L Q 
25261 - AGGCTCTTTACTAGTATAAAAGAAGAATCGGACACCATAGTCAACGATGCCCTCTTGAAT - 25320
-RLFTSIKEESDTIVNDALLN 
-GSLLV*KKNRTP*STMPS*I 
A L Y * YKRRIGHHSQRCPLEF 
25321 - TTTAATTCCTTTATACTTACGTTGGATGGTTGCCATTATGGCTCTAACATCCATGCATAT - 25380
-FNSFILTLDGCHYGSNIHAY 
-LIPLYLRWMVAIMALTSMHI 
- * FLYTYVGWIiPLWL^HPCI*' 
25381 - AGGCATTAATTTTCTTGTCTCTTCAGCATGAGCAAGCATTTCTCTCAAATTCCAGGATAC - 25440 
-RH*FSCLFSMSKHFSQIPGY 
-GINFLVSSA*ASISLKFQDT 
ALIFLSLQHEQAFLSNSRIQ 
25441 - AGTTCCTAGAATCTCTTCCTTAGCATTAGGTGCTTCTGAAGGTAGTACATAAAATGCAGA - 25500 
-SS*-NLFLSIRCF*R*YIKCR 
-VPRI SSLALGASEGST*NAD 
FLESLP*H*VLLKVVHKMQI 
25501 - TTTGCATTTCTTAAGAGCAGTCTTAGCTTCCTCAAGTGTATAACCAGCACATCCTTGTCC - 25560 
-FAFLKSSLSFLKCITSTSLS 
-LHFLRAVLASSSV*PAHPCP 
CIS i EQS J 'I,PQVYNQ;HILVQ 
25561 - AGGGTACGTGGTTATATACTCATCAACTGGCACTTTCTTCAAAGCTCTTGAGAGCATCTC - 25620
-RVRGYILINWHFLQSS*EHL 
-GYVVIYSSTGTFFKALESIS 
GTWLYTHQLALSSKLLRASQ 
25621 - AGTAGTGCCACCAGCCTTTTTGGAGGGTATTACAACACAAGTGATATCACCACTAGTGAT - 25680 
-SSATSLFGGYYNTSDITTSD 
-VVPPAFLEGITTQVISPLVI 
*CHQPFWRVLQHK*YHH*** 
25681 - AACATCACCTACCATGTAAGGTGCATCCTTCTCAAGGAAAGACATATCTTCACCTCTAAG - 25740
-NITYHVRCILLKERHIFTSK 
-TSPTM*GASFSRKDISSP1S 
HHLPCKVHPSQGKTYLHL*A 
25741 - CATGTTCTGAGAATCATGGTAAAGCTTACCATTGATATCAGCAAACAAGAGTAACTTATT - 25800 
-HVLRIMVKLTIDISKQE*LI 
-MF*ESW*SLP1ISANKSNLL 
CSENHGKAYH*YQQTRVTYW 
25801 - GGTAAGAAACTTAGTTTCTTCCAGTGTTGTGGTAACCTCATCAATGCAGGCCTTAATTTT - 25860
-GKKLS FFQCCGNLINAGLNF 
-VRNLVSSSVVVTSSMQALIF 

* E T * FLPVLW *PHQCRP*FL 

25861 - TGGCTTCACATCGACAGGCTTCTGTACGACAGATTTCTCCTCAGTTTTGGAATCTTCTGT - 25920
-WLHI DRLLY DRFLLSFGIFC 
-GFTSTGFCTTDFSSVLESSV 
ASHRQASVRQISPQFWNLLC 

25921 - GTTTGGTGGCTCCTCTTGTTTAGGTGCTTCCACTCTAGGCTTCAGGTTATCAAGATAATC - 25980
-VWWLLLFRCFHSRLQVIKII 
-FGGSSCLGASTLGFRLSR*S 
LVAPLV*VLPL*ASGYQDNP 

25981 - CATGACAACCTGCTCATAAAGAGCTTTGTCATTGACTGCAATATAAACCTGTGTACGAAC - 26040 
-HDNLLIKSFVIDCNINLCTN 
-MTTCS*RALSLTAI*TCVRT 

* Q P A H K E L C H * L Q Y K P V Y E P 
26041 - CGTCTGCACGCACACTTGTAAAGACTGAAGTGGTTTAGCACCAAATATGCCTGCTGACAA - 26100
KWFSTKYAC*Q 
SGLAPNMPADN 
HQICL1TT 



L H A H 



- V C T 



L V K T 



26101 - CAATGGTGCAAGTAAGATGTCCTGTGAATTGAAATTTTCATATGCTGCCTTAAGAAGCTG - 26160
- Q W C K * DVL*IEIFICCLKKL 
-NGASKMSCELKFSYAALRSW 
MVQVRCPVN*NFHMIiP*EAG 

26161 - GATGTCCTCACCTGCATTTAGGTTAGGTCCAACAACATGCAGACACTTCTTAGCAAGATT - 26220 



• D V I T C I 
- M S S 

C P H L a L G 



VRSNNMQTLLSKI 
F L A R L 



Q Q H A D T 



26221 - ATGTCCAGAAAGCAAACAAGACCCTCCTACTGTAAGAGGGCCATTTAGCTTAATGTAATC - 26280 



-MSRKQTR 



-CPESKQDPPTV 
VQKANKTLLL 



S Y C K R A I 



G P F S L M 
G H L A 



26281 - ATCACTCTCCTTTTGCATGGCACCATTGGTTGCCTTGTTGAGTGCACCTGCTACACCACC - 26340
-ITLLLHGTIGCLVECTCYTT 
-SLSFCMAPLVALLSAPATPP 



F A W H 



V H L L H H 



26341 - ACCATGTTTCAGGTGTATGTTAGCAGCATTTACAATCACCATAGGATTAGCACTTTGTGC - 26400



P C F R C 



N H H R 
LAAFTITI 
Q H L Q S 



L A L C A 



26401 - CTCCTTAACGATGTCAACACATTTAATGGCAACATTGTCAGTAAGTTTTAAATAACCAGT - 26460



- 1 L N D V 

- S L T M 



TFNGNIVSKF 
THLMATLSVSF 



CRFWFWLNL 



L V L Q V 



V L V L A Q 



G S I S D C 



26521 - AGTAGTATCATCCAGCCAGTCTTCCTCTTCTTCTTCCTCAACTCGAACTGTTTCAGCTGA - 26580
-S3IIQPVFLFFFLNSNCFS* 
-VVSSSQSSSSSSSTRTVSAB 
* YHPASLPLLLPQLELFQLR 

26581 - GGCACCAAATTCGAGAGGGAGACCTTGATAATCATCCTCTGTACCGTACTCATGTTCACA - 26640 



-GTKFQRETL 
- A P N 



E G D L D 



LYRTHVHR 



26641 - GGTTTCATCAATTTCTTCTTCCTCACACTCTGCATCGTCCTCTTCTTCCTCATCTGGAGG - 26700 



INFFFLTLC 



F L I W R 



LLPHTLHRPLLPHLBG 
26701 - GTAAAAGGAACAATACATACGTGATGAAAAGTTTTCTTCACCAGCATCATCAAATAAGTA - 26760



• V K G T I H T 



QYIRDEKFS 



KVFFTSIIK 



A S S N K 



N V A T 



I N T H V G K 



S R S I 



LHSTHQDQYPCW 



L V R R 



D Q K L 



26821 - TGGTTGTAAAGTCTTCACAACAGCCTCTGCTACAACACATGCAAACTCAGTAACTTCGGT - 26880



LHNSLCYNTCK 
KVFTTASATTHA 



QQPLLQHMQTQ 



V T S V 
26881 - ACCGGATTCAACAGTGTAGACAGAGCACTTTTCATTAAGCACTTTGTCAACACGTTCATC - 26940
-TGFNSVDRALFIKHFVNTFI 
~PDSTV*TEHFSLSTLSTRSS 
RIQQCRQSTFH*ALCQHVHQ 

26941 - AAGCTCAAATGTGATTCTCACATTCTTGTAACCTTGAACTTCCCAAACAGTATCTTCTCC - 27000
-KLKCDSHILVTLNFPNSIFS 
-SSNVILTFL i P*TSQTVSSP 
AQM*FSHSCNLELFKQYLLQ 

27001 - AAAGGTTACACCTTTAATTGGTGCACCCCCTTTTAAGCGAAAGACATTGTTTGTAGCCAG - 27060
-KGYTFNWCTPF*AKDIVCSQ 
-KV TPLIGAPPFKRKTLFVAS 
RLHL^LVHPLLSERHCL^PV 

27061 - TAAACCAGGAGACAATGCGCAGTATTGTTCTTTGTCCTTAATCTCTAAGAGCATGAGGCC - 27120 
-*TRRCjCAVLFFVLNL*EHEA 
-KPGDNAQYCSLSLISKSMRP 



-IYTDWCADDSSICEAINGRL 
-FTQTGVPTIAPFVKLSTGVS 
LHRLVCRR*LHIi*SYQRASR 
27181 - GAGTGCTTCGAGTTCACCGTTCTTGAGAACAACCTCCTCAGAGGTAAGTACTGTGTCATG - 27240 
-ECFEFTVLENWLLRGKYCVM 
-SASSSPFLRTTSSEVSTVSC 
Vr,RVHRS*EQPPQR*VLCHV 
27241 - TGAATCACCTTCAAGAAAGGTTACTTCTTTTGGTGCCTTAAGAGGCATGAGTAGTTGCAG - 27300 
-*ITFKKGYFFWCLKRHE*LQ 
-ESPSRKVTSFGALRGMSSCS 
NHLQERLLLLVP*EA*VVAA 
27301 - CTGCTCCTTGCCACGTATACACTGACGGTAAAGTCCCTTGCTTTGAGCGATGAAGACTTC - 27360 
-LLLATYTLTVKSLALSDEDF 
-CSLPRIH*R*SPI.L*AMKTS 
APCHVYTDGKVPCFER*RLH 
27361 - ACCTAAGTTGAGTGATCGCAACTTTGCGCCAGCGATAGTGACTTGATCAATGCACATTTC - 27420 

- T * V E * SQLCASDSDLINAHF 

- PKLSDRNFAPAIVT*SMHIS 

LS*VIATLRQR* *LDQCTFR 
27421 - GAGTGCCTTGTTAACAACATCAATGAAGCATTTTACACAATCCTTGATGTTATCTGAAGC - 27480
-ECLVNNINEAFYTILDVI*S 
SALLTTSMKHFTQSLMLSEA 

- VPC*QHQ*SILHNP*CYLKQ 

27481 - AACCTGTATTTGACCCTTGACGATGTCAAAAACACCTGTAATGAGAAATTTGAGAATCTC - 27540 
-NLYLTLDDVKNTCNEKFENL 
-TCI*PLTMSKTPVMRNLRIS 
PVFDP*RCQKHL**EI*ESP 
27541 - CCAAGCATCCTTGAGAAATTCAACTCCTGCACTAAGTTTCGCCTCAATCCATTCAAAGAT - 27600 
-PSILEKFNSCTKFRLNPFKD 
-QASLRNSTPALSFASIHSKI 
KHP*EIQLLH*VSPQSIQR i 
27601 - AGGCCTGAGTTTTTCAACAGTAGTGCCCAAAAGATTAGACAACCACTGAGAAGTCTGTTG - 27660
-RPEFFNSSAQKIRQPLRSLL 
-GLSFSTVVPKRLDNH* EVCC 
A*VFQQ*CPKD*TTTEKSVV 
27661 - TACAAGACCACCAGTTACATATGCCATAATAATGACACTGTTGGTGAGCAGGTCTGAAGT - 27720
-YKTTSYICHNNDTVGEQV*S 
-TRPPVTYAIIMTLL VSRSEV 
QDHQLHMP** *HCW*AGLKY 
27721 - ATAAACCATGGCGTCGACAAGACGTAATGACTGTTCAGAAATACCATCAAGTATGGTGAC - 27780
INHGVDKT**LFRNTIKYGD 

• *TMASTRRNDCSEIPSSMVT 
K P W R R Q D V M T V Q K Y H Q V W ■• Q 
"TGCTCTTTGCAAATCAGGAA'I'TGAGrGGTTTGCTGCA'] 

CSLQIRN*VVCCI 
-AALCKSGIEWFAASSVRAKI 
LLFANQELSGLLHQVCAQKL 
27841 - TGATCTGATAACACCAGCAGCCTGTGAGGGAAAACCACACAGTGGTGTTAAAACTGATCT - 27900
-*SDNTSSL*GKTTQWC*N*S 
-DLITPAACEGKPHSGVKTDL 



-LLSNVPSTFYGLSLGNFIVT 
-CCPMFQAPFTGFPLVTL^LP 
VVQCSKHLLRAFPW*LYSYR 
27961 - GCAGGACTCAACAATGGTTTTGAAAGACTTGTAATCAAGACTCTTTATAGTGTCAATAAA - 28020
-AGLNNGFERLVIKTLYSVNK 
-QDSTMVLKDL*SRLFIVSIK 
RTQQWF*KTCNQDSL*CQ*R 
28021 - GGCACTTGTAGAAGCAGAGAAAGATGCCAAAATGATGGCAACCTCTTCATTCAAATGAAA - 28080 
-GTCRSRERCQNDGNLFIQMK 
-ALVEAEKDAKMMATSSFK*K 
HL*KQRKMPK*WQPLHSNEN 
28081 - ATCGCCAACAATGTTAATGTTAACACGTTCACGACTCAGTATCTCAAGGAGATCCTCATT - 28140 
-IANNVNVNTFTTQYLKEILI 
-SPTMLMLTRSRLSISRRSSF 
RQQC*C*HVHDSVSQGDPHS 
28141 - CAAGGTCTCCACATTGTCACCAGTAATGCCAGTATGGCCTGAGCCAATATCAGCACTAGC - 28200
-QGLHIVTSNASMA*ANISTS 
-KVSTLSPVMPVWPEPISALA 
RSPHCHQ*CQYGLSQYQH*H 
28201 - ACGAGGAACCCAGTAGGCACGCTTATTATAGCAGCCAACATAGGCAAACACACAGCCTCC - 28260
-TRNPVGTLIIAANIGKHTAS 
-RGTQ- t ARLL*QPT*ANTQPP 
EEPSRHAYYSSQHRQTHSLQ 
28261 - AAAACATCTAGTCCTACCTCCCTTGCGGAGTCGAGTTTCAATGTTTGAGTGGTTGTGATA - 28320 
-KTSSPTSLAESSFNV*VVVI 
-KHLVLPFLRSRVSMFEWL** 
NI*SYLPCGVEFQCLSGCDN 
28321 - ATCTGCAACACTATGCTCAGGTCCAATCTCTGGGTCTTGACAGGCAGGACATGGCATTTT - 28380 
-ICNTMLRSNLWVLTGRTWHF 
- SATLCSGPISGS*QAGHGIF 
LQHYAQVQSLGLDRQDMAFS 
28381 - CACTACAGCATTAGTAGGTAGGTACCCACATGTAGTAGGTCCTTCAATAACTAAATTTTC - 28440 
-HYSISR*VPTCSRSFNN*IF 
-TTALVGRYPHVVGPSITKFS 



-SATMFTSGFQKVARLP*NFI 
-VPQCSQVAFRKSHVCHETSS 
CHNVHKWLSESRTSAMKLHR 
28501 - GCAATGATTACATTTCATCAAGGTAGACAAGTGCATATTGTTACACTCCTGTGGAGATGC - 28560 
-AMITFHQGRQVHIVTLLWRC 
-Q*LHFIKVDKCILLHSCGDA 
NDYISSR*TSAYCYTPVEMQ 
28561 - AACAGGGTACACAGAGCGTATACGCCCCATGAAACCCTCAGTCTTTTTCTTTTCAACACG - 28620
-NRVHRAYTPHETLSLFLFNT 
-TGYTERIRPMKPSVFFFSTR 
QGTQSVYAP*NPQSFSFQHV 
28621 - TGGTTGAATGACTTTGACTTTTGAGTTAAGAGGAAACACAAACTTTGGGCATTCCCCTTT - 28680
-WLNDFDF*VKRKHKLWAFPF 
-G*MTLTFELRGNTNFGHSPL 
VB*L*LLS*EETQTLGIPL* 
28681 - GAAAGTGTCAAATTTCTTGGCACTCTTAATTTCGAAGGGTGTCTGGTGCTCGTAGCTCTT - 28740 
-ESVKFLGTLNFEGCLVLVAL 
-KVSNFLALLISKGVWCS* LL 
KCQISWHS*FRRVSGARSSY 
28741 - ATCAGAGCGCTCAGTGAACCAGGCAATTTCATGCTCATGGTCACGGCAGCAGTAGACACC - 28800 
-I RALSEPGNFMLMVTAAVDT 
-SERSVNQAISCSWSRQQ*TP 
QSAQ*TRQFHAHGHGSSRHL 
28801 - TCTCTTCGACTCGATGTAATCAAGTTGTTCGGAAAGAGTGCACATTGACTTGCCCGCGCG - 28860 
-SLRLDVIKLFGKSAH*LARA 
-LFDSM*SSCSERVHIDLPAR 
SSTRCNQVVRKECT1TCPRV 
28861 - TGCGAGAAAATCTTTGATGCAATCAAGAGGGTACCCATCTGGGCCACAGAAATTGTTGTC - 28920 
-CEKIFDAIKRVPIWATEIVV 
-ARKSLMQSRGYPSGPQKLLS 
RENL*CNQEGTHLGHRNCCR 
28921 - GACATAGCGAGTGACTGCACCTCCATTGAGCTCACGAGTGAGTTCACGGAGTGCACCACT - 28980 
-DIASDCTSIELTSEFTECTT 
-T*RVTAPPLSSRVSSRSAPL 
HSE*LHLH*AHE*VHGVHHC 
28981 - GCCATGCTTAGTGTTCCAGTTTTGTTCATAATCTTCAATGGGATCAGTGCCAAGCTCGTC - 29040
-AMLSVPVLFIIFNGISAKLV 

- PCLVFQFCS*SSMGSVPSSS 

HA*CSSFVHN1QWDQCQARR 
29041 - ACCTAAGTCATAAGACTTTAGATCGATGCCATAGCTATGACCACCGGCTCCCTTATTACC - 29100

- T * V I R L * I DAIAMTTGSLIT 

- PKS*DFRSMP*L*PPAPLLP 

LSHKTLDRCHSYDHRLPYYR 
29101 - GTTCTTACGAAGAAGAACATTGCGGTATGCAATTGGGGTTTCGCCCACATGTGGCACGAG - 29160
-VLTKKNIAVCNWGFAHMWHE 
-FLRRRTLRYAIGVSPTCGTS 
SYEEEHCGMQLGFRPHVARV 
29161 - TACTCCCAGTGTTATACCGCTACGACCGTACTGAATGCCGTCCATTTCTGCAACCAGCTC - 29220
-YSQCYTATTVLNAVHFCNQL 
-TPSVIPLRPY*MPSISATSS 
LPVLYRYDRTECRPFLQPAQ 
29221 - AACGACCTTGTGGCCGTGATTGGTGCTTAAGGCATCAGAACGTTTAATGAACACATAGGG - 29280 
-NDLVAVIGA*GIRTFNEHIG 
-TTLWP*LV1KASERLMNT*G 
RPCGRDWCLRHQNV* * T H R A 
29281 - CTGTTCAAGCTGGGGCAGTACGCCTTTTTCCAGCTCTACTAGACCACAAGTGCCATTTTT - 29340 
-LFKLGQYAFFQLY*TTSAIF 
-CSSW-GSTPFSSSTRPQVPFL 
VQAGAVRLFPALLDHKCHF* 
29341 - GAGGTGTTCACGTGCCTCCGATAGGGCCTCTTCCACAGAGTCCCCGAAGCCACGCACTAG - 29400 
-EVFTCLR*GLFHRVPE&TH* 
-RCSRASDRASSTESPKPRTS 
GVHVPPIGPLPQSPRSHALA 
29401 - CACGTCTCTAACCTGAAGGACAGGCAAACTGAGTTGGACGTGTGTTTTCTCGTTGACACC - 29460
-HVSNLKDRQTELDVCFLVDT 
-TSLT*RTGKLSWTCVFSLTP 
RL*PEGQAN*VGRVFSR*HQ 

29461 - AAGAACAAGGCTCTCCATCTTACCTTTCGGTCACACCCGGACGAAACCTAGGTATGCTGA - 29520
-KNKALHLTFRSHPDET*VC* 
-RTRLSILPFGHTRTKPRYAD 

- EQGSPSYLSVTPGRNLGMLM 

29521 - TGATCGACTGCAACACGGACGAAACCGTAAGCAGTCTGCAGAAGAGGGACGAGTTACTCG - 29580

- * STATRTKP*AVCRRGTSYS 
-DRLQHGRNRKQSAEEGRVTR 

IDCNTDETVSSLQKRDELLV 
29581 - TTTCTTGTCAACGACAGTAAAATTTATTATTGTTTATACTGCGTAGGTGCACTAGGCATG - 29640
-FLVNDSKIYYCLYCVGALGM 
-FLSTTVKFIIVYTA*VH*AC 
SCQRQ*NLLLFILRRCTRHA 
29641 - CAGCCGAGCGACAGCTACACAGATTTTAAAGTTCGTTTAGAGAACAGATCTACAAGAGAT - 29700 
-QPS DSYTDFKVRLENRSTRD 
-SRATATQILKFV*RTDLQEI 
AERQLHRF A 'SSFREQIYKRS 
29701 - CGAGGTTGGTTGGCTTTTCCTGGGTAGGTAAAAACCTAATAT - 29742 
-RGWLAFPG*VKT*YX 
-EVGWLFIjGR*KPNX 
RLVGFSWVGKNLIX 